LESSON 12:
Agents + Python
LESSON 11:
Agents + Data
Forms
Mapping the video evaluation process
[Process diagram: an XLSX file with the video links in column F, starting at the second row; the evaluation criteria; a report of the evaluations; an output XLSX file; and a SITE page.]
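As a minimal sketch of the first step in this process (assuming a hypothetical file name and the layout described above, with the video links in column F starting at the second row), the links could be read with pandas:

```python
import pandas as pd

# Hypothetical file name; adjust to the actual spreadsheet exported from the form.
# Reading .xlsx files with pandas requires the openpyxl package.
ARQUIVO_XLSX = "avaliacoes.xlsx"

# header=0 treats row 1 as the header, so the data (video links) starts at row 2.
df = pd.read_excel(ARQUIVO_XLSX, header=0)

# Column F is the 6th column (index 5); it holds the video links.
links_videos = df.iloc[:, 5].dropna().tolist()

for link in links_videos:
    print(link)
```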
Data acquisition
Encodings
Chunks
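As an illustration of the chunking step, here is a minimal sketch that splits a long text into overlapping chunks; the chunk size and overlap values are arbitrary assumptions:

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks of roughly chunk_size characters."""
    chunks = []
    start = 0
    while start < len(text):
        end = start + chunk_size
        chunks.append(text[start:end])
        start = end - overlap  # step back by `overlap` so consecutive chunks share context
    return chunks

# Example usage: `document` stands in for text obtained in the data acquisition step.
document = "example text " * 1000
print(len(split_into_chunks(document)))
```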
Word embeddings are numerical representations of a word’s meaning. They are formed based on the assumption that meaning is contextual. That is, a word’s meaning is dependent on its neighbors.
Creating vector embeddings
Embeddings translate the complexities of human language into a format that computers can understand. They use neural networks to assign numerical values to the input data such that similar data receives similar values.
What is Word Embedding?
Word embedding is a language modeling technique for mapping words to vectors of real numbers. It represents words or phrases in a vector space with several dimensions. Word embeddings can be generated using various methods such as neural networks, co-occurrence matrices, and probabilistic models. Word2Vec consists of models for generating word embeddings; these models are shallow two-layer neural networks with one input layer, one hidden layer, and one output layer.
Word2Vec is a widely used method in natural language processing (NLP) that represents words as vectors in a continuous vector space. Developed by researchers at Google, it maps words to high-dimensional vectors in order to capture the semantic relationships between them; its main principle is that words with similar meanings should have similar vector representations. Word2Vec utilizes two architectures: CBOW (Continuous Bag of Words), which predicts a word from its surrounding context, and Skip-gram, which predicts the surrounding context from a word.
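A minimal sketch of training Word2Vec with gensim; the toy corpus is an assumption, and the `sg` parameter switches between the two architectures (0 for CBOW, 1 for Skip-gram):

```python
from gensim.models import Word2Vec

# Toy corpus: each sentence is a list of tokens (in practice, use your chunked documents).
sentences = [
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "cat", "sat", "on", "the", "mat"],
]

# sg=0 -> CBOW (predict a word from its context); sg=1 -> Skip-gram (predict context from a word).
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

# Each word is now a 50-dimensional vector.
print(model.wv["king"].shape)          # (50,)
print(model.wv.most_similar("king"))   # nearest neighbors in the learned vector space
```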
Word embeddings are represented as mathematical vectors. This representation makes it possible to perform standard mathematical operations on words, such as addition and subtraction.
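A sketch of the classic analogy arithmetic using pretrained vectors from gensim's downloader; the model name below is one commonly available option and downloading it requires network access:

```python
import gensim.downloader as api

# Small pretrained GloVe vectors exposed by gensim's downloader (an assumption; any
# pretrained word-vector model with the same interface would work).
vectors = api.load("glove-wiki-gigaword-50")

# king - man + woman ~= queen: addition and subtraction on word vectors.
result = vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3)
print(result)  # 'queen' is expected to appear among the top results
```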
Embeddings
What are Vector Embeddings?
Embeddings are numerical machine learning representations of the semantics of the input data. They capture the meaning of complex, high-dimensional data such as text, images, or audio in vectors, enabling algorithms to process and analyze the data more efficiently.
How do embeddings work?
The quality of the vector representations drives the performance. The embedding model that works best for you depends on your use case.
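A minimal sketch of producing embeddings and comparing them, assuming the sentence-transformers library and the all-MiniLM-L6-v2 model (any other embedding model could be swapped in depending on your use case):

```python
from sentence_transformers import SentenceTransformer, util

# Assumed model; choose whichever embedding model fits your use case.
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "How do I evaluate a video submission?",
    "Steps for grading student videos",
    "Recipe for chocolate cake",
]

# Each sentence becomes a dense vector; similar sentences get similar vectors.
embeddings = model.encode(sentences)

# Cosine similarity: the first two sentences should score higher with each other than with the third.
print(util.cos_sim(embeddings[0], embeddings[1]))
print(util.cos_sim(embeddings[0], embeddings[2]))
```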
Word Embeddings: Example
Neural Networks
Word2Vec: the different nuances of the word “right” are blended into a single vector.
BERT: the embedding reflects the use of a word in its surroundings, allowing the model to understand the context.
https://colab.research.google.com/drive/1P44CeGMe9sOAhslI6bEWotBD281SKeNo?usp=sharing
https://medium.com/@manansuri/a-dummys-guide-to-word2vec-456444f3c673
A Dummy’s Guide to Word2Vec
FAISS
Situation
FAISS
The file names denso_vectors_file (to store the dense vectors) and faiss_index_file (to store the Faiss index) are defined.
IndexFlatL2
It measures the L2 (Euclidean) distance between our query vector and all of the vectors loaded into the index.
It is simple and very accurate, but not very fast.
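A minimal sketch of building, querying, and persisting an IndexFlatL2; the variable names follow the ones mentioned above, the dimensionality is an assumption, and random vectors stand in for real embeddings:

```python
import numpy as np
import faiss

denso_vectors_file = "denso_vectors.npy"    # stores the dense vectors
faiss_index_file = "faiss_index.index"      # stores the Faiss index

d = 128                                      # embedding dimensionality (assumption)
vectors = np.random.random((10_000, d)).astype("float32")  # stand-in for real embeddings

index = faiss.IndexFlatL2(d)   # exhaustive L2 (Euclidean) search: accurate but not fast
index.add(vectors)

query = np.random.random((1, d)).astype("float32")
distances, ids = index.search(query, 4)      # 4 nearest neighbors by L2 distance
print(ids, distances)

# Persist both artifacts for later reuse.
np.save(denso_vectors_file, vectors)
faiss.write_index(index, faiss_index_file)
```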
More methods: speed or accuracy?
Even though one method may be faster than another, we should also take note of the slightly different results returned. Previously, with our exhaustive L2 search, we were returning 7460, 10940, 3781, and 5747. Now we see a slightly different order of results, and two different IDs: 5013 and 5370.
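A hedged sketch of one faster, approximate alternative, IndexIVFFlat, which partitions the vectors into cells and searches only a few of them; the nlist and nprobe values are arbitrary assumptions, and with random stand-in data the results will not match the IDs quoted above:

```python
import numpy as np
import faiss

d = 128
vectors = np.random.random((10_000, d)).astype("float32")
query = np.random.random((1, d)).astype("float32")

nlist = 100                               # number of partitions (cells): an assumption
quantizer = faiss.IndexFlatL2(d)          # the coarse quantizer still uses L2 distance
index_ivf = faiss.IndexIVFFlat(quantizer, d, nlist)

index_ivf.train(vectors)                  # IVF indexes must be trained before adding vectors
index_ivf.add(vectors)

index_ivf.nprobe = 10                     # search 10 cells: higher = more accurate but slower
distances, ids = index_ivf.search(query, 4)
print(ids)  # may differ slightly from the exhaustive IndexFlatL2 results
```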
ChromaDB
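A minimal ChromaDB sketch using the in-memory client; the collection name and documents are placeholders, and Chroma's default embedding function handles the embedding and indexing internally:

```python
import chromadb

client = chromadb.Client()  # in-memory client; use chromadb.PersistentClient for on-disk storage
collection = client.create_collection(name="videos")

# Chroma embeds the documents with its default embedding function and indexes them.
collection.add(
    ids=["v1", "v2"],
    documents=[
        "Evaluation criteria for student videos",
        "How to export results to an XLSX report",
    ],
)

results = collection.query(query_texts=["grading rubric for videos"], n_results=1)
print(results["documents"])
```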
LESSON 13:
Chatbots + content