Let's play llama.cpp

This branch is intended as a playground for educational and research purposes. I love llama.cpp and use it or deal with it every day, and it is self-evident that not every conceivable feature can be implemented or every user request fulfilled. Not only would such an approach contradict the basic philosophy of llama.cpp, but the resources of the main developers are also limited. Nonetheless, I personally would sometimes like to know: what if we were to use a certain sampler that doesn't exist in llama.cpp? Or what if we extended the server with functions from other projects built on the ggml library, e.g. bert.cpp or sd.cpp?

And I think that llama.cpp is the best framework and backend for acquiring your own expertise in the areas of LLMs, neural networks, transformers etc. through learning by doing and self-study.

If it turns out that the added value of an extension goes beyond the mere satisfaction of curiosity and the urge to experiment, then it is worth cleaning up the code and turning it into a pull request so that the whole community can benefit from it.

Successfully completed and merged work

New Server UI

Some aesthetic improvements to make the server UI look nicer, plus some needed functionality such as prompt format templates.

The new UI with 6 different color themes.

With attention to finer details.

Things I am currently working on

TUI to start the server

A shell script that utilizes the dialog tool to compose the right command you need to start the server.
It can also automatically find .gguf files on your computer and additionally save and load configurations/commands.
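
The core logic of such a launcher can be sketched as follows. This is only an illustration in Python, not the actual shell script: it leaves out the interactive dialog menu entirely. The helper names, the `./server` binary path, the default values and the JSON config layout are all assumptions made for this example; `-m`, `-c`, `--host` and `--port` are options of the llama.cpp server.

```python
import json
import shlex
from pathlib import Path


def find_gguf_files(root):
    """Return all .gguf files below root, sorted by path."""
    return sorted(p for p in Path(root).rglob("*.gguf") if p.is_file())


def build_server_command(model, binary="./server", host="127.0.0.1",
                         port=8080, ctx_size=2048):
    """Compose the argument list for starting the server.

    The binary path and the defaults are assumptions; adjust them to your build.
    """
    return [
        binary,
        "-m", str(model),      # path to the model file
        "-c", str(ctx_size),   # context size
        "--host", host,
        "--port", str(port),
    ]


def save_config(path, command):
    """Store a composed command as JSON (hypothetical config layout)."""
    Path(path).write_text(json.dumps({"command": command}, indent=2))


def load_config(path):
    """Load a command previously written by save_config."""
    return json.loads(Path(path).read_text())["command"]


def command_as_string(command):
    """Render the argument list as a shell-safe string for display."""
    return shlex.join(command)
```

A caller would pick one entry from `find_gguf_files(...)`, pass it to `build_server_command(...)`, and either run the result with `subprocess.run` or store it with `save_config` for later. Keeping the command as a list rather than a string avoids quoting problems with model paths that contain spaces.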

GUI to start the server

Same as above but this version utilizes yad or zenity to start a graphical interface.

Things worth considering next

DRY (in progress).
A very promising sampling method (todo: add url -> reddit).

Multilingualism (in progress).
Select a language via a drop-down menu.

Speech to Text.
Implement an interface to whisper.cpp for STT.

Vector Database.
Implement logic to use bert.cpp for efficient embeddings.

Text to Speech.
Waiting for a .cpp/ggml ecosystem TTS Solution ...

Extend UI.
An additional view (tab?) for fine-tuning.

Group-Chat.
Simulated Multi- or Group-Chat

I would be happy to receive any feedback, advice, help and contributions. Feel free to contact me if you're interested in working together on these things.

Docs

If you are looking for support, I would recommend referring to the original llama.cpp documentation, starting with the following:

General Guide
main
server
Performance troubleshooting
GGML tips & tricks
GBNF grammars
