0. Background

It has been a year or two since the explosion of large AI models. So far, the most mature application area is AI-assisted search: products such as Kimi Chat, the Baidu App, and New Bing have all added this capability. The user simply types in what they want to know, the software automatically searches the web, and then summarizes a final answer from the retrieved content, greatly reducing the user's burden of searching and analyzing.

I have previously analyzed the principles behind this kind of AI search feature and implemented an AI search tool from scratch. If you are interested, see the article: "[AI LLM Application Development] [Hands-on Project] AI + Search: Building Your Own AI Search Engine Step by Step (with complete code)".

However, a self-built version is ultimately just a demo: the principle works, but how good are the results? Probably not great in practice. After all, the hallmark of LLM applications is that they are easy to start with but hard to productionize. Building a product that actually works well requires a great deal of detail work and polish.

Recently I came across an open-source AI search tool called Lepton Search, which has 7.5K GitHub stars, so it is quite popular. In this article we will look at its concrete implementation, compare it with my earlier approach, and see whether there is anything worth borrowing.

1. Lepton Search Overview

Live demo: https://search./

GitHub source: https://search./
The UI is quite clean. The answer page after a search looks like this: it lists the final answer, the cited source links, and some related questions the user might want to ask next.

2. Implementation

We will not discuss the front-end implementation here, only the back-end. The core back-end code is roughly 400+ lines, in https://github.com/leptonai/search_with_lepton/blob/main/search_with_lepton.py.

2.1 Summary

The conclusion first: the implementation is essentially the same as the one described in my earlier article. First, a search API retrieves relevant web pages and text; then that text is used as the reference context for RAG and sent to the LLM together with the original question; the LLM produces the final answer based on the reference text.

In short, it is a RAG application; only the data source is different.

2.2 Key Code

2.2.1 Search data sources

The project can use several different search data sources, such as Google and Bing, with ready-to-use code. Of course, you need to apply for the corresponding API keys yourself.

The ready-to-use functions for the different search APIs are defined as follows:

```python
def search_with_bing(query: str, subscription_key: str):
def search_with_google(query: str, subscription_key: str, cx: str):
def search_with_serper(query: str, subscription_key: str):
def search_with_searchapi(query: str, subscription_key: str):
```
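Whatever the provider, each of these search functions ultimately returns a list of result "contexts" (title, URL, snippet) that the rest of the pipeline consumes. As a hedged sketch of that normalization step (the helper name is my own, not from the repo; the field names follow the shape of Bing's `webPages.value` response and would differ per provider):

```python
# Hypothetical helper: flatten a Bing-style Web Search JSON response
# into the list of {name, url, snippet} dicts the RAG step expects.
def normalize_bing_results(raw: dict, max_results: int = 8) -> list[dict]:
    pages = raw.get("webPages", {}).get("value", [])
    contexts = []
    for page in pages[:max_results]:
        contexts.append({
            "name": page.get("name", ""),
            "url": page.get("url", ""),
            "snippet": page.get("snippet", ""),
        })
    return contexts

# Example with a mock response (no network call):
mock = {"webPages": {"value": [
    {"name": "Lepton Search", "url": "https://example.com",
     "snippet": "An AI search demo."},
]}}
print(normalize_bing_results(mock))
```

The `snippet` field is what matters downstream: it is the text that gets numbered and injected into the RAG prompt.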
2.2.2 The query entry function

`query_function` is the project's query entry point. Its main code is as follows:

```python
def query_function(
    self,
    query: str,
    search_uuid: str,
    generate_related_questions: Optional[bool] = True,
) -> StreamingResponse:
    if self.backend == "LEPTON":
        # delegate to the lepton search api.
        result = self.leptonsearch_client.query(
            query=query,
            search_uuid=search_uuid,
            generate_related_questions=generate_related_questions,
        )
        return StreamingResponse(content=result, media_type="text/html")

    # First, do a search query.
    query = query or _default_query
    ......
    contexts = self.search_function(query)

    system_prompt = _rag_query_text.format(
        context="\n\n".join(
            [f"[[citation:{i+1}]] {c['snippet']}" for i, c in enumerate(contexts)]
        )
    )
    try:
        client = self.local_client()
        llm_response = client.chat.completions.create(
            model=self.model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": query},
            ],
            max_tokens=1024,
            stop=stop_words,
            stream=True,
            temperature=0.9,
        )
        if self.should_do_related_questions and generate_related_questions:
            # While the answer is being generated, we can start generating
            # related questions as a future.
            related_questions_future = self.executor.submit(
                self.get_related_questions, query, contexts
            )
        else:
            related_questions_future = None
    except Exception as e:
        ......
```
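One detail worth noting above is the concurrency trick: the related questions are submitted to a thread pool so they are generated in the background while the main answer streams. A minimal self-contained sketch of that pattern (the two worker functions here are stand-ins I made up; in the real code both call an LLM):

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in workers (the real ones call an LLM).
def generate_answer(query: str) -> str:
    return f"answer to: {query}"

def generate_related(query: str) -> list[str]:
    return [f"{query} follow-up {i}" for i in range(1, 4)]

executor = ThreadPoolExecutor(max_workers=2)

def handle_query(query: str) -> tuple[str, list[str]]:
    # Kick off related-question generation in the background...
    related_future = executor.submit(generate_related, query)
    # ...while the main answer is produced (streamed, in the real code).
    answer = generate_answer(query)
    # Collect the related questions once both are done.
    return answer, related_future.result()

print(handle_query("what is RAG?"))
```

This hides the latency of the second LLM call behind the first, so the related questions are usually ready by the time the answer finishes streaming.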
The code above does the following, which are the standard steps of an AI search engine:

(1) `contexts = self.search_function(query)` retrieves the relevant text.

(2) `system_prompt` assembles the RAG prompt. The prompt template is:

```python
_rag_query_text = """
You are a large language AI assistant built by Lepton AI. You are given a user question, and please write clean, concise and accurate answer to the question. You will be given a set of related contexts to the question, each starting with a reference number like [[citation:x]], where x is a number. Please use the context and cite the context at the end of each sentence if applicable.

Your answer must be correct, accurate and written by an expert using an unbiased and professional tone. Please limit to 1024 tokens. Do not give any information that is not related to the question, and do not repeat. Say "information is missing on" followed by the related topic, if the given context do not provide sufficient information.

Please cite the contexts with the reference numbers, in the format [citation:x]. If a sentence comes from multiple contexts, please list all applicable citations, like [citation:3][citation:5]. Other than code and specific names and citations, your answer must be written in the same language as the question.

Here are the set of contexts:

{context}

Remember, don't blindly repeat the contexts verbatim. And here is the user question:
"""
```
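Putting steps (1) and (2) together: the citation-numbered context string substituted into `{context}` is built with a simple enumerate-and-join, as in the repo's `query_function`. A standalone sketch (the template here is abbreviated and the snippets are made up for illustration):

```python
# Abbreviated stand-in for the repo's much longer _rag_query_text.
_rag_query_text = """Answer using the contexts below, citing [citation:x].

Here are the set of contexts:

{context}

And here is the user question:
"""

def build_system_prompt(contexts: list[dict]) -> str:
    # Number each snippet so the model can emit [citation:x] references.
    joined = "\n\n".join(
        f"[[citation:{i+1}]] {c['snippet']}" for i, c in enumerate(contexts)
    )
    return _rag_query_text.format(context=joined)

contexts = [
    {"snippet": "Lepton Search is an open-source AI search demo."},
    {"snippet": "It retrieves web snippets and feeds them to an LLM."},
]
print(build_system_prompt(contexts))
```

Note that only the snippets go into the prompt, not the full page text; the URLs are kept on the side so the front-end can render the numbered citations as links.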
(3) `client.chat.completions.create` calls the LLM to get the answer.

These three steps are the basics. The project adds one extra step: generating related questions.

2.2.3 Generating related questions

Showing related questions to the user is useful and meaningful in some situations: it gives the user hints and inspiration when they are not sure what to ask next.

It is implemented as follows:

```python
def get_related_questions(self, query, contexts):
    ......
    try:
        response = self.local_client().chat.completions.create(
            model=self.model,
            messages=[
                {
                    "role": "system",
                    "content": _more_questions_prompt.format(
                        context="\n\n".join([c["snippet"] for c in contexts])
                    ),
                },
                {
                    "role": "user",
                    "content": query,
                },
            ],
            tools=[{
                "type": "function",
                "function": tool.get_tools_spec(ask_related_questions),
            }],
            max_tokens=512,
        )
    ......
```
The principle is again to use the LLM: given the original question and the retrieved contexts, it generates a few related questions. The prompt makes the approach easy to see:

```python
_more_questions_prompt = """
You are a helpful assistant that helps the user to ask related questions, based on user's original question and the related contexts. Please identify worthwhile topics that can be follow-ups, and write questions no longer than 20 words each. Please make sure that specifics, like events, names, locations, are included in follow up questions so they can be asked standalone. For example, if the original question asks about "the Manhattan project", in the follow up question, do not just say "the project", but use the full name "the Manhattan project". Your related questions must be in the same language as the original question.

Here are the contexts of the question:

{context}

Remember, based on the original question and related contexts, suggest three such further questions. Do NOT repeat the original question. Each related question should be no longer than 20 words. Here is the original question:
"""
```
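Note also the `tools=[...]` argument in `get_related_questions`: instead of free-form text, the model is asked to return the questions through a function call, which makes the output structured and trivially parseable. A hedged sketch of what such a tool spec and the response parsing could look like (the hand-written schema and the mock arguments below are my own illustration, not the repo's exact `get_tools_spec` output):

```python
import json

# Hypothetical hand-written tool spec for an "ask_related_questions"
# function that receives a list of question strings.
related_questions_tool = {
    "type": "function",
    "function": {
        "name": "ask_related_questions",
        "description": "Return three related follow-up questions.",
        "parameters": {
            "type": "object",
            "properties": {
                "questions": {
                    "type": "array",
                    "items": {"type": "string"},
                }
            },
            "required": ["questions"],
        },
    },
}

def parse_related_questions(arguments_json: str) -> list[str]:
    # Chat-completions APIs return tool-call arguments as a JSON string.
    return json.loads(arguments_json).get("questions", [])

# Mock of what a tool call's arguments might contain:
mock_arguments = json.dumps({"questions": [
    "What search APIs does Lepton Search support?",
    "How does Lepton Search cite its sources?",
    "What LLM does Lepton Search use by default?",
]})
print(parse_related_questions(mock_arguments))
```

Using a tool call here sidesteps the fragile business of regex-splitting three questions out of a plain-text completion.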