Let's play llama.cpp

This branch is intended as a playground for educational and research purposes. I love llama.cpp and use it or deal with it every day, and it goes without saying that not every conceivable feature can be implemented, nor every user request fulfilled. Not only would such an approach contradict the basic philosophy of llama.cpp, but the resources of the main developers are also limited. Nonetheless, I sometimes want to know: what if we used a certain sampler that doesn't exist in llama.cpp? Or what if we extended the server with functions from other projects built on the ggml library, e.g. bert.cpp or sd.cpp?

I also think that llama.cpp is the best framework and backend for building your own expertise in LLMs, neural networks, transformers, etc. through learning by doing and self-study.

If the added value of an extension turns out to go beyond satisfying curiosity and the urge to experiment, it should be cleaned up and submitted as a pull request so that the whole community can benefit from it.

Successfully completed and merged work

New Server UI

Some aesthetic improvements to make the server UI look nicer, plus some needed functionality such as prompt format templates.

The new UI with 6 different color themes.

With attention to finer details.

Things I am currently working on

TUI to start the server

A shell script that uses the dialog tool to compose the command you need to start the server.
It can also automatically find .gguf files on your computer, and it can save and load configurations/commands.
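
The idea behind the launcher can be sketched in a few lines. This is only a minimal illustration in Python, not the actual shell script from this branch: the function names, the `./server` binary path, and the JSON config layout are assumptions made for the example. The flags `-m`, `-c`, `--host` and `--port` are the usual llama.cpp server options for model path, context size, host and port.

```python
import json
import shlex
from pathlib import Path


def find_gguf_files(root):
    """Return all .gguf files below root (searched recursively), sorted by path."""
    return sorted(Path(root).expanduser().rglob("*.gguf"))


def build_server_command(model, host="127.0.0.1", port=8080, ctx_size=2048,
                         binary="./server"):
    """Compose the argument list used to start the server.

    The binary path is an assumption; adjust it to wherever your build lives.
    """
    return [binary, "-m", str(model), "-c", str(ctx_size),
            "--host", host, "--port", str(port)]


def save_config(path, command):
    """Store a composed command as JSON so it can be reused later."""
    Path(path).write_text(json.dumps({"command": command}, indent=2))


def load_config(path):
    """Load a previously saved command."""
    return json.loads(Path(path).read_text())["command"]


if __name__ == "__main__":
    # Look for models below the current directory and print a ready-to-run command.
    models = find_gguf_files(".")
    if models:
        print(shlex.join(build_server_command(models[0])))
    else:
        print("No .gguf files found.")
```

Building the command as a list rather than a single string keeps paths with spaces intact, and `shlex.join` is only used at the end to show a copy-pasteable line.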

GUI to start the server

Same as above, but this version uses yad or zenity to provide a graphical interface.

Things worth considering next

DRY (in progress).
A very promising sampling method (todo: add url -> reddit).

Multilingualism (in progress).
Select a language via a drop-down menu.

Speech to Text.
Implement an interface to Whisper.cpp for STT.

Vector Database.
Implement logic to use Bert.cpp for efficient embeddings.

Text to Speech.
Waiting for a TTS solution from the .cpp/ggml ecosystem ...

Extend UI.
An additional (tab?) view for fine-tuning.

Group-Chat.
Simulated multi- or group chat.

I would be happy to receive any feedback, advice, help, and contributions. Feel free to contact me if you're interested in working together on these things.

Docs

If you are looking for support, I recommend referring to the original llama.cpp project first.

mounta11n/plusplus-camall