	Pull requests (for contributors)
- llama.cpp uses the ggml tensor library for model evaluation. If you are unfamiliar with ggml, consider taking a look at the examples in the ggml repository. `simple` shows the bare minimum for using ggml. `gpt-2` has minimal implementations for language model inference using GPT-2. `mnist` demonstrates how to train and evaluate a simple image classifier
- Test your changes:
    - Execute the full CI locally on your machine before publishing
    - Verify that the perplexity and the performance are not affected negatively by your changes (use `llama-perplexity` and `llama-bench`)
    - If you modified the `ggml` source, run the `test-backend-ops` tool to check whether different backend implementations of the `ggml` operators produce consistent results (this requires access to at least two different `ggml` backends)
    - If you modified a `ggml` operator or added a new one, add the corresponding test cases to `test-backend-ops`
- Create separate PRs for each feature or fix. Avoid combining unrelated changes in a single PR
- Consider allowing write access to your branch for faster reviews, as reviewers can push commits directly
- If your PR becomes stale, don't hesitate to ping the maintainers in the comments
 
Pull requests (for collaborators)
- Squash-merge PRs
- Use the following format for the squashed commit title: `<module> : <commit title> (#<issue_number>)`. For example: `utils : fix typo in utils.py (#1234)`
- Optionally pick a `<module>` from here: https://github.com/ggml-org/llama.cpp/wiki/Modules
- Consider adding yourself to CODEOWNERS
 
Coding guidelines
- Avoid adding third-party dependencies, extra files, extra headers, etc.
- Always consider cross-compatibility with other operating systems and architectures
- Avoid fancy-looking modern STL constructs, use basic `for` loops, avoid templates, keep it simple
- Vertical alignment makes things more readable and easier to batch edit
- Clean-up any trailing whitespaces, use 4 spaces for indentation, brackets on the same line, `void * ptr`, `int & a`
- Use sized integer types such as `int32_t` in the public API; `size_t` may also be appropriate for allocation sizes or byte offsets
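
    For example, a short sketch that follows these formatting points - basic `for` loops, 4-space indentation, brackets on the same line, pointer/reference spacing, vertical alignment and sized integer types (illustrative only, not taken from the codebase):

    ```cpp
    #include <cstddef>
    #include <cstdint>

    // hypothetical helper, shown only to illustrate the formatting guidelines above
    static void count_nonzero(const float * data, int64_t n, int32_t & n_nonzero) {
        n_nonzero = 0;
        for (int64_t i = 0; i < n; i++) {
            if (data[i] != 0.0f) {
                n_nonzero++;
            }
        }
    }

    struct buffer_view {
        void  * ptr;  // start of the buffer
        size_t  size; // size in bytes
    };
    ```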
- Declare structs with `struct foo {}` instead of `typedef struct foo {} foo`
    - In C++ code omit optional `struct` and `enum` keyword whenever they are not necessary

    ```cpp
    // OK
    llama_context * ctx;
    const llama_rope_type rope_type;

    // not OK
    struct llama_context * ctx;
    const enum llama_rope_type rope_type;
    ```

    (NOTE: this guideline is yet to be applied to the `llama.cpp` codebase. New code should follow this guideline.)
- Try to follow the existing patterns in the code (indentation, spaces, etc.). In case of doubt use `clang-format` (from clang-tools v15+) to format the added code
- For anything not covered in the current guidelines, refer to the C++ Core Guidelines
- Tensors store data in row-major order. We refer to dimension 0 as columns, 1 as rows, 2 as matrices
- Matrix multiplication is unconventional: `C = ggml_mul_mat(ctx, A, B)` means $C^T = A B^T \Leftrightarrow C = B A^T$.
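
To make the row-major and matrix multiplication conventions concrete, here is a minimal sketch that builds a `ggml_mul_mat` node and inspects its shape (the tensor sizes and scratch buffer size are arbitrary illustrative values; it assumes `ggml.h` is available and the program is linked against ggml):

```cpp
#include "ggml.h"

#include <cstdio>

int main() {
    struct ggml_init_params params = {
        /*.mem_size   =*/ 16*1024*1024, // arbitrary scratch size for this example
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };

    struct ggml_context * ctx = ggml_init(params);

    // row-major: dimension 0 is the number of columns, dimension 1 the number of rows
    struct ggml_tensor * A = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 3); // 3 rows x 4 cols
    struct ggml_tensor * B = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 2); // 2 rows x 4 cols

    // C = ggml_mul_mat(ctx, A, B) means C^T = A B^T, i.e. C = B A^T
    // both inputs must share dimension 0; the result has ne[0] == A->ne[1] and ne[1] == B->ne[1]
    struct ggml_tensor * C = ggml_mul_mat(ctx, A, B);

    printf("C: ne[0] = %lld, ne[1] = %lld\n", (long long) C->ne[0], (long long) C->ne[1]); // 3, 2

    // note: this only constructs the graph node - computing the values would additionally
    // require building and evaluating a compute graph
    ggml_free(ctx);

    return 0;
}
```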
Naming guidelines
- Use `snake_case` for function, variable and type names
- Naming usually optimizes for longest common prefix (see https://github.com/ggml-org/ggml/pull/302#discussion_r1243240963)

    ```cpp
    // not OK
    int small_number;
    int big_number;

    // OK
    int number_small;
    int number_big;
    ```
- Enum values are always in upper case and prefixed with the enum name

    ```cpp
    enum llama_vocab_type {
        LLAMA_VOCAB_TYPE_NONE = 0,
        LLAMA_VOCAB_TYPE_SPM  = 1,
        LLAMA_VOCAB_TYPE_BPE  = 2,
        LLAMA_VOCAB_TYPE_WPM  = 3,
        LLAMA_VOCAB_TYPE_UGM  = 4,
        LLAMA_VOCAB_TYPE_RWKV = 5,
    };
    ```
- The general naming pattern is `<class>_<method>`, with `<method>` being `<action>_<noun>`

    ```cpp
    llama_model_init();           // class: "llama_model",         method: "init"
    llama_sampler_chain_remove(); // class: "llama_sampler_chain", method: "remove"
    llama_sampler_get_seed();     // class: "llama_sampler",       method: "get_seed"
    llama_set_embeddings();       // class: "llama_context",       method: "set_embeddings"
    llama_n_threads();            // class: "llama_context",       method: "n_threads"
    llama_adapter_lora_free();    // class: "llama_adapter_lora",  method: "free"
    ```

    - The `get` `<action>` can be omitted
    - The `<noun>` can be omitted if not necessary
    - The `_context` suffix of the `<class>` is optional. Use it to disambiguate symbols when needed
    - Use `init`/`free` for constructor/destructor `<action>`

- Use the `_t` suffix when a type is supposed to be opaque to the user - it's not relevant to them if it is a struct or anything else

    ```cpp
    typedef struct llama_context * llama_context_t;

    enum llama_pooling_type llama_pooling_type(const llama_context_t ctx);
    ```

    (NOTE: this guideline is yet to be applied to the `llama.cpp` codebase. New code should follow this guideline)
- C/C++ filenames are all lowercase with dashes. Headers use the `.h` extension. Source files use the `.c` or `.cpp` extension
- Python filenames are all lowercase with underscores
- (TODO: abbreviations usage)
 
Preprocessor directives
- (TODO: add guidelines with examples and apply them to the codebase)

    ```cpp
    #ifdef FOO
    #endif // FOO
    ```
Documentation
- Documentation is a community effort
- When you need to look into the source code to figure out how to use an API, consider adding a short summary to the header file for future reference (see the sketch after this list)
- When you notice incorrect or outdated documentation, please update it
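
For example, a short summary in the header can save the next reader a trip into the implementation. The declaration below is hypothetical and shown only for illustration; it is not part of the llama.cpp API:

```cpp
// split a UTF-8 string into whitespace-separated pieces and write up to n_max pointers into `out`
// returns the number of pieces written, or a negative value if `out` is too small
// (hypothetical declaration, for illustration only)
int32_t my_split_words(const char * text, const char ** out, int32_t n_max);
```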
 
Resources
The GitHub issues, PRs and discussions contain a lot of information that can be useful to get familiar with the codebase. For convenience, some of the more important information is referenced from GitHub projects:
