Building a Telegram-based RAG application: Part 2
Systematically developing the LLM app using the LlamaIndex template.

About / TL;DR
This is the final installment of the two-part LLM application design series, where I'll develop each component service and describe how to approach a use case with a developer mindset: from the initial idea and the tools to be used, to the iterative development of the actual product. Check out the first part of the series to understand the workflow of building a RAG application on top of Telegram chat data.
The code is open-sourced here but still a WIP; feel free to open a PR or leave comments about potential changes or any issues you face running this app.
🚧 Application development steps:
Diving straight into development, I started with the pre-built mock template from LlamaIndex named RAGS (GitHub) and added each component to the given app. We begin with parsing the Telegram data, as follows:
1. Parsing the Telegram data:
We start by building a util script that parses data from Telegram across the channels the bot is subscribed to (using the Telethon library); a minimal sketch of it appears after the parameter list below:
Here we've stored the data in a DataFrame with the following columns:
- Channel name, userId, Message, MediaPath (for multimedia messages), timestamp.
- Along with the media files, stored in folders in the format <channels>/<userId>/<parameters>/<media_asset>.
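A minimal sketch of such a parsing util, assuming the standard Telethon client API; the channel list, paths, and column names here are illustrative:

```python
import asyncio
import pandas as pd
from telethon import TelegramClient

# Load these from a secrets file in practice; never hardcode or commit them.
API_ID = 12345          # placeholder
API_HASH = "api_hash"   # placeholder

async def parse_channels(channel_names: list[str]) -> pd.DataFrame:
    rows = []
    client = TelegramClient("session", API_ID, API_HASH)
    await client.start()
    try:
        for channel in channel_names:
            async for msg in client.iter_messages(channel, limit=500):
                media_path = None
                if msg.media:
                    # Download media into <channel>/<userId>/... style folders
                    media_path = await msg.download_media(
                        file=f"{channel}/{msg.sender_id}/"
                    )
                rows.append({
                    "channel_name": channel,
                    "user_id": msg.sender_id,
                    "message": msg.text,
                    "media_path": media_path,
                    "timestamp": msg.date,
                })
    finally:
        # Disconnect to avoid Telethon's "database is locked" session error.
        await client.disconnect()
    return pd.DataFrame(rows)

if __name__ == "__main__":
    df = asyncio.run(parse_channels(["some_channel"]))
    df.to_json("datas/messages.json", orient="records")
```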
Things to remember:
- There can be issues with the Telethon library, mainly a "database is locked" error caused by not disconnecting the TelegramClient object at the end of dataset parsing; make sure to call client.disconnect() once parsing finishes.
⚠️ Ensure that the api_hash and api_id are kept secure and never accidentally leaked. Similar to web3 wallets, these are one-time, unique identifiers that act as private keys, and they permit anyone holding them to message on your behalf.
Simplifying the process for users to query across various channels was challenging: multiple accounts and their corresponding channels couldn't easily be put into a single DataFrame. So I took help from the pushshift telegram repository, which implements an ETL pipeline using Telethon to parse the files into three datasets: channels, accounts, and messages. This makes it easier to load the data with traditional DataFrame loaders, while also providing extensive information about the mapping between users, their channels, and their messages.
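For illustration, loading the three datasets and joining them back together could look like the following (the file and column names are assumptions, not the repository's exact schema):

```python
import pandas as pd

# Hypothetical file names; the pushshift-style ETL splits the dump into
# three related datasets.
channels = pd.read_json("datas/channels.json", orient="records")
accounts = pd.read_json("datas/accounts.json", orient="records")
messages = pd.read_json("datas/messages.json", orient="records")

# Join each message back to its channel and author metadata.
enriched = (
    messages
    .merge(channels, left_on="channel_id", right_on="id", suffixes=("", "_channel"))
    .merge(accounts, left_on="user_id", right_on="id", suffixes=("", "_account"))
)
print(enriched.head())
```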
2. Setting up the core service application of RAGS
We have the following folders in the RAGS repository:
- Pages: The code that defines the overall UI of the application, consisting of the following templates:
- Main page (1__Home__.py): This page consists of agent profile selection, configuration, and the chat interface where you define the various steps of data parsing, chunking for vector storage, indexing, and the nature of the query (multimodal / single category of data). Via the chat interface you can write prompts that let the tool frameworks set the various categories of properties needed to build the agent.
- RAG configuration (2__Config__.py): This page holds the settings for the various steps of the RAG pipeline; once the dataset is uploaded, users can fine-tune the vector indexing parameters (top-k, chunk size, embedding model, etc.).
- Generated RAG agent (3_Generated_RAG_Agent.py): The final stage, where the agent is available for users to submit general queries and run them against the given dataset.
I only changed the description of the tasks that I wanted the chat agent to use when defining the default system prompt; everything else is agnostic, so developers can build query-retrieval applications on top of any dataset.
- Core: The set of libraries and scripts that define the tools and frameworks for building agents (agent_builder), caching the results and the created agents' RAG pipeline parameters (param_cache.py), and loading the agent (utils.py), including the loaded and chunked dataset, chat history, and integrations with LLM inference APIs (OpenAI), etc.
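To make the caching idea concrete, here is an illustrative sketch only; the field names and file layout are hypothetical, not param_cache.py's actual schema:

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path

@dataclass
class RAGParams:
    # Hypothetical fields mirroring the kind of settings RAGS caches.
    embed_model: str = "openai"
    chunk_size: int = 512
    top_k: int = 5

def save_params(agent_id: str, params: RAGParams, cache_dir: Path = Path("cache")) -> None:
    # Persist the parameters an agent was built with, keyed by agent id.
    cache_dir.mkdir(exist_ok=True)
    (cache_dir / f"{agent_id}.json").write_text(json.dumps(asdict(params)))

def load_params(agent_id: str, cache_dir: Path = Path("cache")) -> RAGParams:
    data = json.loads((cache_dir / f"{agent_id}.json").read_text())
    return RAGParams(**data)
```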
Finally: Steps for running the RAGS framework on Telegram data
For the Telegram dataset tutorial with RAGS, I followed these steps:
1. Setting up the OpenAI / Hugging Face keys in .streamlit/secrets.toml and configuring the settings on the main page (a quick sanity check for the keys is sketched after this step):
- Select the option to activate multimodal search.
- Define the embedding models you want to use for indexing the data you want to query; the options include OpenAI's CLIP, instructor models, etc.
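As a quick sanity check that step 1 worked, note that Streamlit exposes .streamlit/secrets.toml through st.secrets; the key names below are assumptions, so match them to whatever the app actually reads:

```python
import streamlit as st

# Hypothetical key names: RAGS reads its API keys from .streamlit/secrets.toml.
openai_key = st.secrets.get("openai_key")
hf_token = st.secrets.get("hf_token")
assert openai_key, "Add your OpenAI key to .streamlit/secrets.toml first"
```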
2. Now ask the chatbot to load the data from the given files. This is done with a prompt like: "load the data which is defined in the directory: datas/*.json".
3. Then define the parameters of the vector index storage (for now it lets the user set parameters like the top-k score, chunk_size, embed_model, etc.). In our messages use case, k = 5 and a chunk_size of 512 (the average message size), with CLIP as the embedding model, did the trick; a sketch of an equivalent plain llama_index setup appears after this list.
4. Then it asks for the system prompt the agent should follow, defining its rules for analysing and generating results for user queries. This is optional, given that the default system prompt is rigorous enough for most use cases: it explains to the agent the bounds on how to use the tools to query results while giving consistent answers with sources.
5. Then wait a while as the corresponding embedding model is set up locally and the related tokenizers run over the chunked data.
6. Tada!! 🎉🎉 We now have a fully working chat agent running locally on your data. You can then ask queries about the categories of discussion users are having, or typical insights like which users are the most engaged, and you'll be able to access the history of chats and results.
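For reference, here is a minimal sketch of roughly what the agent builder configures under the hood, written against a recent llama_index release (the package layout changed around v0.10; the file path and query are illustrative):

```python
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.clip import ClipEmbedding  # pip install llama-index-embeddings-clip

# Step 3's choices: chunk_size = 512 (the average message size) and CLIP embeddings.
Settings.embed_model = ClipEmbedding()  # defaults to the ViT-B/32 checkpoint
Settings.node_parser = SentenceSplitter(chunk_size=512)

# Load the parsed Telegram messages (path is illustrative).
documents = SimpleDirectoryReader(input_files=["datas/messages.json"]).load_data()
index = VectorStoreIndex.from_documents(documents)

# top-k = 5 retrieval, matching the RAGS configuration above. Answer
# synthesis uses the default OpenAI LLM, so OPENAI_API_KEY must be set.
query_engine = index.as_query_engine(similarity_top_k=5)
print(query_engine.query("Which users are the most engaged?"))
```

Pinning k to 5 keeps the retrieved context small enough for short chat messages while still giving the synthesiser a few independent sources to cite.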


Troubleshooting FAQ:
This framework did indeed have some errors that needed an update:
1. Lack of selection of multiple / multimodal embedding models apart from OpenAI: there are currently more performant models available on Hugging Face that are not integrated into the RAGS pipeline. This is resolved by:
- Adding the Hugging Face API keys in secrets.toml and providing the embed_models input as defined in the HuggingFaceEmbedding class (see the sketch below).
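A minimal sketch of that swap, assuming the llama-index Hugging Face embeddings package (the model id is just an example):

```python
from llama_index.embeddings.huggingface import HuggingFaceEmbedding  # pip install llama-index-embeddings-huggingface

# Any Hugging Face model id can be passed as embed_model; bge-small is an example.
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
vector = embed_model.get_text_embedding("a sample telegram message")
print(len(vector))  # embedding dimensionality (384 for bge-small)
```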
2. The process being killed while calling the create_agent() function: this error happens when agent creation takes much longer than anticipated, causing the Streamlit application to hit a function timeout (here is the thread for more discussion). For now the options are to host the application on larger instances, or to build a locally optimised stack for the embedding models (thanks to the Ollama framework); a sketch of the latter follows.
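And a sketch of that local route, assuming the llama-index Ollama embeddings package and a model already pulled locally (e.g. via ollama pull nomic-embed-text):

```python
from llama_index.embeddings.ollama import OllamaEmbedding  # pip install llama-index-embeddings-ollama

# Embeddings are served by the local Ollama daemon, so agent creation no
# longer depends on remote API latency or timeouts.
embed_model = OllamaEmbedding(model_name="nomic-embed-text")
print(len(embed_model.get_text_embedding("hello telegram")))
```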
Thanks for reading this far, and do leave a like on the page. Also check out and ⭐ the repo and the article.