
License: MIT

Let's play llama.cpp


This branch is intended as a playground for educational and research purposes. I love llama.cpp and use it or deal with it every day, and it goes without saying that not every conceivable feature can be implemented and not every user request can be fulfilled. Such an approach would contradict the basic philosophy of llama.cpp, and the resources of the main developers are limited. Nonetheless, I personally would sometimes like to know: what if we were to use a certain sampler that doesn't exist in llama.cpp? Or what if we extended the server with functions from other projects built on the ggml library, e.g. bert.cpp or sd.cpp?

I also think that llama.cpp is the best framework and backend for building your own expertise in LLMs, neural networks, transformers, etc. through learning by doing and self-study.

If it turns out that the value of an extension goes beyond satisfying curiosity and the urge to experiment, then it is worth cleaning up the code and turning it into a pull request so that the whole community can benefit from it.



Successfully completed and merged work


New Server UI


Some aesthetic improvements to make the server UI look nicer, plus some needed functionality such as prompt format templates.



The new UI with 6 different color themes.






With attention to finer details.






Things I am currently working on


TUI to start the server


A shell script that uses the dialog tool to compose the command you need to start the server.
It can also automatically find .gguf files on your computer and save and load configurations/commands.
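
The three steps described above (find model files, compose a command, save and load it) can be sketched roughly as follows. This is only an illustration in Python, not the actual shell script: the function names, the JSON config format, and the `./server` binary path are assumptions made for this example, and only a few common server options (`-m`, `-c`, `--port`) are shown.

```python
import json
import shlex
from pathlib import Path


def find_gguf_files(root):
    """Return all .gguf files below `root`, sorted by path."""
    return sorted(p for p in Path(root).rglob("*.gguf") if p.is_file())


def compose_command(model, ctx_size=2048, port=8080, binary="./server"):
    """Build the argument list for starting the server.

    `binary` is an assumed path; adjust it to wherever your build lives.
    """
    return [binary, "-m", str(model), "-c", str(ctx_size), "--port", str(port)]


def save_config(path, command):
    """Store a composed command as JSON (hypothetical config format)."""
    Path(path).write_text(json.dumps(command))


def load_config(path):
    """Load a previously saved command."""
    return json.loads(Path(path).read_text())


def render(command):
    """Return the command as a single shell-quoted string."""
    return shlex.join(command)
```

A front end such as dialog would present the result of `find_gguf_files` as a menu, pass the selection to `compose_command`, and then either run the command or store it with `save_config` for later use.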



GUI to start the server


Same as above, but this version uses yad or zenity to provide a graphical interface.






Things worth considering next


  • DRY (in progress).
    Very promising sampling method (todo: add url -> reddit).

  • Multilingualism (in progress).
    Select a language via a drop-down menu.

  • Speech to Text.
    Implement an interface to whisper.cpp for STT.

  • Vector Database.
    Implement logic to use bert.cpp for efficient embeddings.

  • Text to Speech.
    Waiting for a TTS solution from the .cpp/ggml ecosystem ...

  • Extend UI.
    An additional view (tab?) for fine-tuning.

  • Group-Chat.
    Simulated multi- or group chat.


I would be happy to receive any feedback, advice, help and contributions. Feel free to contact me if you're interested in working together on these things.




Docs

If you are looking for support, I recommend referring to the original llama.cpp and first considering the following: