ask_youtube_playlists.question_answering package
Submodules
ask_youtube_playlists.question_answering.extractive module
Contains the functionality to perform extractive question answering.
- ask_youtube_playlists.question_answering.extractive.get_extractive_answer(question: str, context: str, model_name: str = 'deepset/roberta-base-squad2') str [source]
Returns the answer to a question using extractive question answering.
- Parameters:
question (str) – The question.
context (str) – The context.
model_name (str, optional) – The model name. Defaults to “deepset/roberta-base-squad2”.
- Returns:
A dictionary with the ‘answer’ as a string, the ‘score’ as a float and the ‘start’ and ‘end’ as integers.
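A minimal usage sketch, assuming the package is installed and the default deepset/roberta-base-squad2 model can be downloaded; the question and context strings below are illustrative only.
```python
from ask_youtube_playlists.question_answering.extractive import get_extractive_answer

# Illustrative context; in the full system this comes from retrieved transcripts.
context = (
    "The Transformer architecture was introduced in 2017 in the paper "
    "'Attention Is All You Need'."
)
answer = get_extractive_answer(
    question="When was the Transformer architecture introduced?",
    context=context,
)
print(answer)
```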
ask_youtube_playlists.question_answering.generative module
Contains the functionality to answer a question using generative models.
- class ask_youtube_playlists.question_answering.generative.LLMSpec(model_name: str, model_type: str, max_tokens: int)[source]
Bases:
object
Class to store the information of a language model.
- model_name
The name of the language model.
- Type:
str
- model_type
The class or method used to load the language model.
- Type:
str
- max_tokens: int
- model_name: str
- model_type: str
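A short sketch of constructing an LLMSpec by hand; the model name, type, and token limit are illustrative values rather than names confirmed by this documentation.
```python
from ask_youtube_playlists.question_answering.generative import LLMSpec

# Illustrative values only; use get_model_spec to obtain the spec of a
# model actually supported by the package.
spec = LLMSpec(
    model_name="google/flan-t5-base",
    model_type="huggingface",
    max_tokens=512,
)
print(spec.model_name, spec.model_type, spec.max_tokens)
```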
- ask_youtube_playlists.question_answering.generative.get_generative_answer(question: str, relevant_documents: List[Document], model_name: str, temperature: float, max_length: int) str [source]
Returns the answer to the question as a string.
- Parameters:
question (str) – The question asked by the user.
relevant_documents (List[Document]) – The list of relevant documents.
model_name (str) – The name of the language model.
temperature (float) – The temperature used to generate the answer.
max_length (int) – The maximum length of the generated answer.
- Returns:
The answer to the question.
- Return type:
str
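A hedged usage sketch; the documents are built by hand here, whereas in practice they come from the retriever module, and the model name is an assumption rather than a name listed in this documentation.
```python
from langchain.schema import Document

from ask_youtube_playlists.question_answering.generative import get_generative_answer

# Hand-built documents for illustration; normally these are the
# DocumentInfo.document objects returned by Retriever.retrieve.
docs = [
    Document(page_content="The playlist explains linear regression step by step."),
    Document(page_content="Gradient descent is used to fit the model parameters."),
]

answer = get_generative_answer(
    question="How are the model parameters fitted?",
    relevant_documents=docs,
    model_name="google/flan-t5-base",  # assumed name; see get_model_spec
    temperature=0.7,
    max_length=256,
)
print(answer)
```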
- ask_youtube_playlists.question_answering.generative.get_model_spec(model_name: str) LLMSpec [source]
Returns the language model specification.
- Parameters:
model_name (str) – The name of the language model.
- Returns:
The language model specification.
- Return type:
LLMSpec
- Raises:
ValueError – If the language model is not available.
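A small sketch of looking up a model specification and handling the documented ValueError; the model name is an assumption.
```python
from ask_youtube_playlists.question_answering.generative import get_model_spec

try:
    spec = get_model_spec("google/flan-t5-base")  # assumed model name
    print(spec.model_type, spec.max_tokens)
except ValueError:
    print("This model is not available in the package.")
```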
- ask_youtube_playlists.question_answering.generative.load_model(model_name: str, temperature: float = 0.7, max_length: int = 1024) BaseLLM [source]
Loads the language model.
- Parameters:
model_name (str) – The language model name.
temperature (float, optional) – The temperature used to generate the answer. The higher the temperature, the more “creative” the answer will be. Defaults to 0.7.
max_length (int, optional) – The maximum length of the generated answer. Defaults to 1024.
- Returns:
The language model.
- Return type:
llms.base.BaseLLM
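A minimal sketch of loading a model; the model name is an assumption, and a lower temperature is chosen here to make answers less "creative".
```python
from ask_youtube_playlists.question_answering.generative import load_model

# Returns a langchain BaseLLM configured with the given temperature
# and maximum answer length.
llm = load_model("google/flan-t5-base", temperature=0.2, max_length=512)
```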
ask_youtube_playlists.question_answering.retriever module
Contains the functionality used to retrieve the most relevant documents for a given question.
- class ask_youtube_playlists.question_answering.retriever.DocumentInfo(document: Document, score: float, playlist_name: str)[source]
Bases:
NamedTuple
Class to store information about a document.
- document
The document text or content.
- Type:
langchain.schema.Document
- score
The relevance score of the document. The higher the score, the more relevant the document is. It is in the range [0, 1].
- Type:
float
- playlist_name
The name of the playlist to which the document belongs.
- Type:
str
- document: Document
Alias for field number 0
- playlist_name: str
Alias for field number 2
- score: float
Alias for field number 1
- class ask_youtube_playlists.question_answering.retriever.Retriever(retriever_directory: Path, config_filename: str = 'hyperparams.yaml')[source]
Bases:
object
Class to retrieve the most relevant documents for a given question.
- static cosine_distance(question_embedding: ndarray, document_embedding: ndarray) float [source]
Calculates the cosine distance between two vectors.
- Parameters:
question_embedding (np.ndarray) – The embedding of the question.
document_embedding (np.ndarray) – The embedding of the document.
- Returns:
The cosine distance between the two vectors.
- Return type:
float
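For reference, a standalone NumPy sketch of one common cosine-distance convention (one minus cosine similarity); the exact convention used internally is not spelled out in this documentation.
```python
import numpy as np


def cosine_distance(question_embedding: np.ndarray,
                    document_embedding: np.ndarray) -> float:
    """One common convention: distance = 1 - cosine similarity."""
    similarity = np.dot(question_embedding, document_embedding) / (
        np.linalg.norm(question_embedding) * np.linalg.norm(document_embedding)
    )
    return float(1.0 - similarity)
```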
- classmethod retrieve(retrievers: List[Retriever], question: str, n_documents: int) List[DocumentInfo] [source]
Retrieves the most relevant documents with their score and the playlist they belong to.
This function retrieves documents in two steps:
1. Extracts the most relevant documents from each retriever in the retrievers list.
2. Ranks the retrieved documents from all retrievers and returns the most relevant ones, together with their score and the playlist they belong to.
- Parameters:
retrievers (List[Retriever]) – A list of retrievers.
question (str) – The question posed by the user.
n_documents (int) – The number of documents to retrieve.
- Returns:
A list of named tuples, each containing the document, its score and the playlist it belongs to. The list is sorted in descending order by relevance score.
- Return type:
list
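A usage sketch combining several retrievers; the directory paths are hypothetical and assume each retriever directory contains the default hyperparams.yaml config.
```python
from pathlib import Path

from ask_youtube_playlists.question_answering.retriever import Retriever

# Hypothetical directories, one per playlist, each holding a hyperparams.yaml.
retrievers = [
    Retriever(Path("embeddings/playlist_a")),
    Retriever(Path("embeddings/playlist_b")),
]

results = Retriever.retrieve(retrievers,
                             question="What is gradient descent?",
                             n_documents=5)
for info in results:  # sorted by descending relevance score
    print(f"{info.playlist_name}: {info.score:.2f}")
```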
- retrieve_from_playlist(question: str, n_documents: int) List[DocumentInfo] [source]
Retrieves the most relevant documents with their relevance score.
- Parameters:
question (str) – The question posed by the user.
n_documents (int) – The number of documents to retrieve.
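A single-playlist counterpart of the sketch above; the directory path is again hypothetical.
```python
from pathlib import Path

from ask_youtube_playlists.question_answering.retriever import Retriever

retriever = Retriever(Path("embeddings/playlist_a"))  # hypothetical directory
top_docs = retriever.retrieve_from_playlist("What is gradient descent?",
                                            n_documents=3)
print(len(top_docs), top_docs[0].score)
```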
- property total_number_of_documents: int
Returns the total number of documents.
Module contents
Implements the question answering system.
It consists of three components:
1. Retrieval: retrieves the most relevant documents for a given question.
2. Extractive: extracts the most relevant sentences from the retrieved documents.
3. Generative: generates an answer to the question from the extracted sentences.
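A hedged end-to-end sketch of this flow using the retrieval and generative components; the embeddings directory and model name are assumptions, and the extractive component could be used in place of the generative one.
```python
from pathlib import Path

from ask_youtube_playlists.question_answering.generative import get_generative_answer
from ask_youtube_playlists.question_answering.retriever import Retriever

question = "What is gradient descent?"

# Retrieval step: get the most relevant documents (hypothetical directory).
retrievers = [Retriever(Path("embeddings/playlist_a"))]
doc_infos = Retriever.retrieve(retrievers, question, n_documents=3)

# Generative step: produce an answer from the retrieved documents.
answer = get_generative_answer(
    question=question,
    relevant_documents=[info.document for info in doc_infos],
    model_name="google/flan-t5-base",  # assumed name
    temperature=0.7,
    max_length=256,
)
print(answer)
```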