C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | AA | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | What is your position/title at your company? | How big is your organization? (number of employees) | Are you using LLMs in your organization? | What is your use case/use cases? | Have you integrated or built any internal tools to support LLMs in your org? If so, what? | What are some of the main challenges you have encountered thus far when building with LLMs? | What are your main concerns with using LLMs in production? | How are you using LLMs? | What tools are you using with LLMs? | What areas are you most interested in learning more about? | How do you deal with reliability of output? | Any stories you have that are worth sharing about working with LLMs? | Any questions you have for the community about LLMs in production? | What is the main reason for not using LLMs in your org? | What are some key questions you face when it comes to using LLMs in prod? | Have you tried LLMs for different use cases in your org? | If yes, why did it not work out? | Any stories you have that are worth sharing about working with LLMs? | Any questions you have for the community about LLMs in production? | ||||||
3 | Data scientist | 1,000+ | No | Production takes investment | Cost to maintain the service | Yes | Date quality | N/a | |||||||||||||||||
4 | ML Engineer | 50-500 | Yes | Text summarisation | No | Current infrastructure not designed for such massive models, having to implement workarounds for quick fixes | Cost | Open source model (GPT-J, etc) | Seldon | Inference for LLMs | |||||||||||||||
5 | Founder | 1-10 | Yes | Data annotation; summarization; search | https://chat.mantisnlp.com, internal annotation tools | out of date info, hallucinations, cost, difficulty in deploying on own infrastructure | they could serve nonsense or out of date information | Open AI API | Working with OSS LLMs | we provide links to the context from which answers have been drawn. | |||||||||||||||
6 | Owner | 1-10 | Yes | Content production and brainstorming | Not yet | not yet | cost of openai API | Open AI API | Looking at Pinecone, weaviate, langchain, hugging face | Embeddings | temperature = 0 | ||||||||||||||
7 | Senior Principal Scientist | 1,000+ | Yes | Healthcare | yes, confidential | reliability | susceptibility towards random generation | In house model | Working with OSS LLMs | ||||||||||||||||
8 | Head of Machine Learning | 50-500 | No | It's only just now becoming feasible with the ChatGPT API. Before it was too expensive and quality was not good enough. | How to make them reliable. | We are still exploring, e.g. for support in diagnosis of plant diseases. | So far it's not reliable enough and we haven't spent enough time preparing special content for it. Like rules it should use during diagnosis. The data ChatGPT was trained on seems a bit hit and miss for us. | Anyone have some experience with getting better reliability out of ChatGPT? Anyone succeeded in fine-tuning and getting better quality than ChatGPT? Thinking of llama for example. How can you limit ChatGPT so users will not ask it totally random stuff? | |||||||||||||||||
9 | Senior Machine Learning Engineer | 1,000+ | Yes | Search, classification, NER | Just a bunch of shitty scripts I need to refactor in the future. | Size in all regards. Also training time, response time and that multi gpu debugging is weird. | Whether the cost involved for us justifies the gains we get, considering we mainly make money from ads and not a really good product. | Other model provider API | Hugging Face, sagemaker, openapi api, tensorflow. | Working with OSS LLMs | I'll refer to the picture of the xkcd guy "just stir the model until the numbers look right". | I am not sure what ranks as an LLM. Anything bert and beyond? Anyway, we once used setfit for a production task but found that performance was really tricky to debug. The proof of concept was fine. Production was a complete mess. To this day I don't know why - underlying hardware? | I mainly wonder how everyone is monitoring these. ALSO. For my company GDPR is a big deal. My DPO shoots down our ideas more often than not or limits it to using the last three months of data. How do other people deal with this? | ||||||||||||
10 | Head of Product | 10-50 | No | No applications | ROI of more expensive models | Yes | No ROI | N/A | Interested in results of survey | ||||||||||||||||
11 | 50-500 | Yes | oi | lik | il | Open AI API | loi | Working with OSS LLMs | lo | lo | lo | ||||||||||||||
12 | Senior Data Engineer | 50-500 | Yes | text embeddings for search, text generation to help the business people do their job | Yes, we're using OpenAI for a couple of internal tools. | Cost. But our issues with productionisation are more general than just for LLMs | Open AI API | streamlit | Inference for LLMs | All internal tools, reliability is not a major concern for now | |||||||||||||||
13 | Machine learning engineer | 1,000+ | Yes | Natural Language interface, domain specific content creation | Prompt debugging! We need a new form of debugger. | They are erratic | Open AI API | Working with OSS LLMs | We have control systems | ||||||||||||||||
14 | Senior Data Scientist | 1,000+ | Yes | Chatbots and Text classification | One of our use cases is to build text classification models for each of our customers (we are a B2B company). Achieved this using LLMs and some engineering | Open source model (GPT-J, etc) | Embeddings | ||||||||||||||||||
15 | ML Engineer | 50-500 | No | ||||||||||||||||||||||
16 | Snr MLOps Engineer | 50-500 | Yes | Adapters | Yes, training setup as well as inference setup | Latency and deployment time | Resources usage along with latency and deployment time | Open source model (GPT-J, etc) | Inference for LLMs | Assessment at evaluation | |||||||||||||||
17 | 1,000+ | Yes | Inference for LLMs | ||||||||||||||||||||||
18 | Senior ML engineer | 50-500 | No | Two main reasons: 1. Very compute & power hungry 2. Cannot be relied upon as it will very confidently return incorrect information or in “existent facts” which is problematic for use in my industry. Eg it took us less than 3 minutes to get BioGPT to tell us vaccines cause autism | What use cases does it have for our business? How much gain do we get from using an LLM vs a simpler language model where we can control fact and outputs? How much will it cost (to deploy, and run) | We abandoned the idea pretty quickly | We got terrible suggestions from the LLM that were false information and would be detrimental to the business | Ask BioGPT about vaccines and autism and it’ll confidently tell you that vaccines cause autism spectrum disorders in a small percentage of children (yikes) | How do you force an LLM to only return factual information? | ||||||||||||||||
19 | Machine Learning Engineer | 50-500 | No | We use medium-sized language models (e.g. RoBERTa-base) as the backbone for our deployed NLP models because they can perform inference on CPU, which is cheap. We don't use LLMs (e.g. ChatGPT, GPT-3) because the APIs are expensive, and fine-tuning them to perform NLP tasks other than language generation (e.g. NER, open-set intent classification) is not straightforward. | How can we ensure the same version of the model is being used? How can we supply a random seed to get deterministic results? How can we minimize LLM API usage costs? How can we fine-tune a LLM on a niche task without having to replicate the weights of the LLM? How can we deploy our own copy of an open-source LLM without breaking the bank? | Yes | We evaluated a LLM (GPT-3) against RoBERTa fine-tuned on our small task-specific dataset, and our fine-tuned model got much higher accuracy than GPT-3 did. | None | |||||||||||||||||
20 | 1,000+ | No | Not many use cases for us | ||||||||||||||||||||||
21 | Founding Data Scientist | 1-10 | Yes | Summarization, topic extraction, root-cause analysis | Not much, just some prompt storage and automated retries to deal with different failure cases. | Hallucinations are a big one. In general, problems are hard to debug without careful manual review, so we're only using them when the amount of results generated is small enough to review by hand. | Reliability, cost, injecting domain-specific language and understanding into them | Open AI API | Embeddings | Automated retries for some simple & obvious failures, manual review for everything else | Where are others drawing the line between training models locally and using LLMs? Have others had success with using LLMs to bootstrap training data for smaller local models? | ||||||||||||||
22 | CTO | 1-10 | Yes | Marketing | Not yet | Still developing | Cost | Open AI API | Fine-tuning LLMs | ||||||||||||||||
23 | Machine Learning Engineer | 500-1,000 | No | Not applicable to the domain | Not yet | Can you actually compete with the giants for a good language model? | |||||||||||||||||||
24 | Head of AI | 1,000+ | Yes | Scanning large text corpora using an LLM and some cases with text summary for now | Not (yet). | It’s a mix of taming a beast and using alchemy | Consistency of the output quality | Other model provider API | ./. | Working with OSS LLMs | HITL | They take every Input equally serious which makes for some great laughs… | |||||||||||||
25 | Product | 1,000+ | No | Production LLM lifecycle and use cases not clear | Data governance | no | |||||||||||||||||||
26 | Lead Data Scientist | 50-500 | No | Cost / right use-case | |||||||||||||||||||||
27 | Co-founder | 1-10 | Yes | AI for e-commerce | No | Hardware requirements and AI accelerators compatibility | Inference scalability | Open source model (GPT-J, etc) | PyTorch, AWS SageMaker, OpenAI's API, Midjourney, CLIP/BLIP/etc models | Inference for LLMs | Any opinions regarding the newer AI accelerator chips? Many people fighting for GPUs, but not necessarily mentioning AWS Inferentia 2 chips, TPUs, Habana Gaudi, etc. | ||||||||||||||
28 | MLE | 50-500 | Yes | dbt pre-processing to get the text right for the prompts and test it | explainability, expense, embeddings | see above | Open AI API | metaflow | Embeddings | ||||||||||||||||
29 | Snr MLE | 50-500 | No | Exploration stage | Putting company data in the right structure and updating the model with it | yes | getting fine tune right and LLMops pipeline | Is it cheaper to host your own LLM on AWS or just use OpenAI services? With respect to scaling and potential user base of 5000 | |||||||||||||||||
30 | Director of Artificial Intelligence | 1,000+ | Yes | Novelty - ChatGPT is being explored to optimize code, critique written works, etc. | No | N/A | Complacency and atrophy of critical thinking | Open source model (GPT-J, etc) | Microsoft Azure OpenAI services | Fine-tuning LLMs | Educate others on the limitations and proper use. | One of our citizen data scientists now (figuratively) "worships" ChatGPT Gods given its capabilities. LOL. | None at the moment given our priorities are mostly with computer vision and time-series data analysis. | ||||||||||||
31 | Data Scientist | 50-500 | Yes | Summarisation, Email generation, Information retrieval | Modular python library for interacting with multiple LLM providers/APIs and a standardized prompt generation framework. | Consistency of formatting in freeform LLM text generation. Subtleties in prompt engineering. | Detecting truthfulness, hallucination. Output validation. Risk of monopolisation and centralised access to LLM as a service | Other model provider API | Python | Working with OSS LLMs | |||||||||||||||
32 | Co-founder and CTO | 1-10 | Yes | Prompt autogeneration and synthetic data generation | Reproducibility | Hallucination and latency | Open AI API | HoneyHive | Fine-tuning LLMs | Multi-responses shown to user side by side, ghost writer style dropout | How are you thinking about tracking user feedback in production? | ||||||||||||||
33 | 50-500 | Yes | Working with OSS LLMs | ||||||||||||||||||||||
34 | Senior Software Engineer | 50-500 | No | No business need yet | |||||||||||||||||||||
35 | Co-founder | 1-10 | No | We're learning | |||||||||||||||||||||
36 | ML Developer | 500-1,000 | Yes | Question Answering | yes | latency | latency, factuality | Open source model (GPT-J, etc) | Inference for LLMs | ||||||||||||||||
37 | 1-10 | No | Cost, Latency, Accuracy | Cost can be high, hallucinations | |||||||||||||||||||||
38 | data scientist | 500-1,000 | No | not really in our domain | N/A | no - closest we've got to dealing with language is fuzzy lookups | N/A | Nope - but I'm excited to learn more | |||||||||||||||||
39 | Senior Machine Learning Engineer | 10-50 | Yes | Question generation for accountants | built internal | reducing latency | not knowing the data used for the initial training | In house model | TensorFlow, TFX | Inference for LLMs | yes | ||||||||||||||
40 | Founder | 1-10 | Yes | Zero-shot/Few Shot classification for data enrichment | Internally, we use BERT to create embeddings, then train a smaller model on customer data. | Infrastructure costs. | Is what I'm building getting obsolete in 3 months? | In house model | In house Bert (from Huggingface) + Open AI | Fine-tuning LLMs | I am stuck with this right now - can I integrate these into the UI for AI Hero. | ChatGPT API output is different every time for the same input - makes it hard to create a reliable service. | How are you managing training/deployment of these? | ||||||||||||
41 | SRE | 1,000+ | Yes | analyze logs | no | Analyzing results | security | In house model | in house built tool | Fine-tuning LLMs | Very manual so far. | confidential | no | ||||||||||||
42 | Cofounder (normally product manager, but also wearer of many hats) | 1-10 | No | 1. gcm diagflow is good for complete basic dialogs, 2. hallucinations 3.a. how to make answers be focused on what my company is doing rather than general answers, 3.b. how to keep that evolving as we learn more about the customers 3.c. how to prevent a practical joker who's conversing with my llm to inject some practical joke into it 4. if I'm relying on OpenAI's llm, i need to always get a reply, what happens if they're busy, what's the proverbial 'failover' | how to get it to use my company's specific knowledge | not thoroughly enough yet | still working on it | openai - even with monthly subscription it is not predictable output | do you use it for chatbots, how do you put guardrails on it for conversations that are going off track or how do you prevent practical jokes if you're re-incorporating the knowledge back into the llm | ||||||||||||||||
43 | Data Architect | 1,000+ | Yes | Search and chat on proprietary data | No | Automation and scaling of transfer learning, tuning and reinforcement learning | Scaling | Open AI API | Lang chain, k8s, gcp | Learning from new/private data | Not a concern. In-house | No | Best automated/scalable pipeline for tuning, transfer learning (augmentation), reinforcement feedback | ||||||||||||
44 | Senior software engineer | 1,000+ | No | No need atm | Reliability | ||||||||||||||||||||
45 | consultant | 1-10 | No | we'll be using at a startup I'm part of | cost and fit for purpose. Actual market validation | yes. I've used them for classification | they are in prod for that use case | ||||||||||||||||||
46 | CTO | 50-500 | Yes | Product (Software) recommendations, Internal Q&A, Marketing content, and Customer support | Langchain, OpenAI Playground, ChatGPT, PromptLayer, Weaviate | Rapidly changing APIs | Hallucinations misleading customers | Open AI API | PromptLayer | Embeddings | Pre/post-processing, constitutional / self-auditing, customer feedback thumbs up/down, custom embedding matrix | How might you avoid siloing all of your embeddings when different teams need different tools without building it all in house yet all would benefit from the combined collection? | |||||||||||||
47 | Data scientist / co-founder | 1-10 | Yes | Information retrieval and text generation. | We have some output validation methods and templates, basically regex matching prompts. | Input length limitation and having to work around it with non-optimal methods. | Hallucinations | Open AI API | Langchain, Huggingface, Polars | Fine-tuning LLMs | We implement checking with less stochastic tools. | Content language matters a lot! | |||||||||||||
48 | Analytics engineer | 50-500 | Yes | Chatbot | In POC at the moment | Vectorizing CRUD and ci / CD | To get better at semantic search | Open AI API | Langchain, pinecone, slack and aws | Embeddings | How to better deploy python apps using aws and operationalizing it. How to better manage vector indexes. | ||||||||||||||
49 | Director, Innovation | 50-500 | No | Data privacy concerns | Difficulty or ease of maintenance; costs; data privacy | Not yet | na | None yet | Are there any standards/best practices to deal with data privacy? | ||||||||||||||||
50 | ML & MLOps practice lead | 50-500 | Yes | Text classification, text generation | Compute performance, fine tuning challenges, explainability | Open source model (GPT-J, etc) | Inference for LLMs | ||||||||||||||||||
51 | Founder | 1-10 | Yes | Social media idea generation | No just used langchain and OpenAI | Prompt engineering; handling very large documents | Bad data/completions; runtime errors parsing completions (we use regex to parse each line ) | Open AI API | LangChain | Fine-tuning LLMs | We don’t do anything right now | ||||||||||||||
52 | CTO | 1-10 | Yes | Semantic data extraction and corporate governance | I'm unable to utilise LLMs currently due to data security restrictions, however as an OpenAI beta tester, around 2 years ago I created a business Operational Delivery Framework for a £500M nuclear project. I then transferred the learning framework to £5B project. | Environment dependency conflicts | Safe and ethical usage | Open AI API | Use case dependent | Fine-tuning LLMs | AI is replaced by HI(Human) for final check, always. | Yes, the ODF I created was regarded as best in class by regulators and has been suggested to be implemented as new gold standard. I used GPT3 to advise on the design of the framework, and to write all the necessary code to create it. | Nope, no questions, just a great big thank you, and a virtual high five for all the awesome peeps 🙏 | ||||||||||||
53 | MLOps Engineer | 50-500 | No | Still evaluating infrastructure requirements and ways to make it traceable if things go wrong | Retraining, tackle hallucinations and avoid wrong results | No | |||||||||||||||||||
54 | Junior developer | 1-10 | Yes | Creating a chat bot to query data efficiently for managers | Not yet but it's in production. Using langchain. | The main challenges are using the different tools from langchain to optimise for the use case. There are a lot of things to consider. Also, we use ts and angular front end so not having all the tools that are available in python required us to have python as the backend now. | It’s how useful/valuable it will actually be for the client. | Open AI API | Langchain/ts/csharp/angular. | Fine-tuning LLMs | We haven't gotten to that yet. | They are amazing. But as mentioned, a chatbot alone is probably not enough to be useful, hence building out other capabilities will become important. | |||||||||||||
55 | deep learning engineer | 1,000+ | Yes | extract information from image and text of medical domain | no | deploy models into production | high training & inference cost | Open source model (GPT-J, etc) | transformers, triton inference server | Working with OSS LLMs | |||||||||||||||
56 | Co-Founder | 10-50 | No | Lack of use cases that require so much sophistication | cost of maintenance, lack of institutional knowledge | No | Have previously tried using them, however, in many cases, LLMs aren't all that reliable. I tried to use it as a chatbot for a client but it didn't work as expected in many cases which is why I didn't end up using it in production. | None | |||||||||||||||||
57 | Intern | 1,000+ | No | Because I am in education sector. | when you use LLM in production it can fail any time | NO | |||||||||||||||||||
58 | CTO | 1-10 | Yes | Coding | VSCode Extension | Work with 3 AI providers at the same time | Wrong responses from the model | Open AI API | Code GPT https://codegpt.co | Embeddings | For now, just responding to tickets on GitHub | The extension has more than 200,000 downloads and it works great! | |||||||||||||
59 | Software Development Engineer | 1,000+ | Yes | (Not in my org, at work I don’t do this, it’s personal) Cover letter generator based on resume and job description | Embedding with appropriate data effectively | Being dependent on one company for the API so far. Having models of equivalent performance whose weights were public would be a great relief | Open AI API | Gpt index, langchain | Embeddings | Prompt engineering, allow user to be specific, and of course re-generate | |||||||||||||||
60 | Data Science Team Lead | 10-50 | Yes | Topic sentiment evaluation on text | Managing training data labeling | explainability | Open source model (GPT-J, etc) | AWS Sagemaker | Fine-tuning LLMs | Test each new model candidate on test set | |||||||||||||||
61 | Sr. ML Eng | 50-500 | No | inference cost vs. impact on revenue | how to reduce inference cost | ||||||||||||||||||||
62 | Senior AI Expert | 50-500 | Yes | Medical Diagnostics Support | No, we rely on open source tools | Explainability, Testing, Getting ground truth data | Trustworthiness | Open source model (GPT-J, etc) | Milvus DB, FastAPI, Hugging Face, PyTorch, etc. | Working with OSS LLMs | We try to make the output as transparent as we can (using explainability techniques) and not relying upon LLM for the end decision (these are used as a support tool) | ||||||||||||||
63 | Data Science Engineer | 50-500 | Yes | Assistant chat bot | Langchain. Will serve the bot from a flask server, being set up for the bot, into our existing java platform | Langchain LLM Parse error. Turbo not using multiple tools for some reason, when davinci-003 did | Need HIPAA compliance, so trying to get on Azure. Worried about the contract falling through | Open AI API | Langchain. All my agent’s tools are custom-built. I hit our internal APIs with them | I want deep knowledge on how to make the ReAct pattern work in various edge cases | Will have friendly internal users using it first in the course of serving customers, and giving us per-message feedback | davinci-003 hallucinating using an existing tool without actually using it! | How should I split my initial prompt up between turbo’s System prompt and the first Human prompt I give it? | ||||||||||||
64 | Head of Data and ML | 10-50 | No | No Use case, being a fintech startup, besides helping with processing documents | why waste the money | no | |||||||||||||||||||
65 | ML/MLOps Engineer | 500-1,000 | No | No clear use case | Running them, fine tuning, cost | Nope | |||||||||||||||||||
66 | Data Scientist, HappyFresh | 500-1,000 | No | Not | |||||||||||||||||||||
67 | Data Scientist | 10-50 | Yes | Automation of report writing. | Yes. Responsible for the design, building, testing and deployment of a GPT-3.5 application that processes numerical data and free-response text in tabular form to generate commentary and analysis for a visual report. Researchers are only required to upload a zip folder (1-click). Technologies: AWS lambda, Docker (ECR/ECS), GPT3.5, PyTorch. | control of the output for our usecase. Reports need to understand the client's intent and take information from briefing calls and key requirements. These can help guide the output, but not always. | That it will hallucinate something that we won't pick up in the report editing phase | Open AI API | Torch and Pandas | Prompt Engineering (Riley Goodside) | Have editors comb through it | We aim to turn a report around from survey finishing to client inbox in 24 hours. So far we are at 48 hours. | |||||||||||||
68 | Data Scientist | 50-500 | Yes | LLM in Marketing, Custom data | HF Inference Endpoints | Cloud Cost | No proper documentation | Other model provider API | Fine-tuning LLMs | ||||||||||||||||
69 | Senior Data Scientist | 500-1,000 | No | We don't have a need for generating unstructured data, we work in small languages, our text is very domain specific, LLMs are very compute intensive -> expensive to run and take a lot of work to set up. | Cost, latency, actual performance in the languages / domains we work with (real languages, not programming) | Nope. Tried some BERT for classification but that hardly counts as an LLM. And I've made models as big as BERT I think. | Why are most of the models on Hugging Face based on WordPiece and BPE tokenizers? Give me bytes or give me death | ||||||||||||||||||
70 | Data Scientist | 50-500 | Yes | chatbot/voice assistant | just used RASA for chatbot creation and haystack for open domain question answering | currently used only pre-trained models, biggest challenge was to obtain data for specific domain fine tuning | it is very important in my opinion to accurately monitor the kind of usage to prevent frauds or malicious intentions | Other model provider API | Haystack | Fine-tuning LLMs | it depends on the specific context and specific use cases, in my experience I rely on validation set and a validation tool that estimates accuracy of answers based on answers manually scrutinized by humans | not so far | it would be interesting to have a discussion about how to control the output and in general the kind of usage the system has been exposed to | ||||||||||||
71 | Head of Data Science | 1,000+ | Yes | Entity matching, customer service responses (souped up/targeted FAQ). | yes - high scale server on K8's | K8's caches... TESTING. | cost, latency, truthfulness - but also maybe management in the sense of versioning and resolving / attributing issues? Copyright and intellectual property liability - who knows what it's been trained on and what it might spit out... What happens if it provides a Disney script for an advert? (I know the answer - nothing good). Safety: what if it tells someone to kill themselves? Or spits out a recipe to make smallpox? (I am being kind of silly here, but... not entirely) | Open source model (GPT-J, etc) | not sure what this means - we use k8's and cloud functions to run them, also interface to case management solutions. | calibration of llm output / calibration and testing of llms | badly. We do lots of testing but I constantly feel we are on shaky ground. | We had a nice time selecting models, proving it would all work and using TSNE and PCA to reduce the dimensionality of embeddings and get compression for our vector lookup system... Very good and satisfying work, we got prizes and praise from all. Then came some dreadful times making our vector lookup system work under load for long durations in a proper prod environment (where we couldn't touch it to do things like restart it or reload the index). I think this will be a big issue in the future - we need things that are as reliable as Oracle for running the inferences, we do not have them - everything is glued together (I am guessing this is not true if you work for Google...) | |||||||||||||
72 | Founder | 1-10 | Yes | We are making tools to monitor and improve the performance of LLMs | No additional tools | Hallucinations | Open source model (GPT-J, etc) | UpTrain | Embeddings | We are building a tool for that | How is everyone thinking about fine-tuning LLMs for their use-cases? | ||||||||||||||
73 | CEO | 1-10 | Yes | Embeddings, Feature Extraction, Chatbot | Database connections | Context length limitations | Hallucinations, rate limiting | Open AI API | Langchain, Llama Index, Deep lake, Weaviate | Fine-tuning LLMs | Error handling | ||||||||||||||
74 | Founder | 1-10 | Yes | Text summarization, code generation | No | Dealing with vector stores | Keeping agent on track with system (OpenAI’s ChatGPT) or role | Open AI API | LangChain | Embeddings | N/A | N/A | What are some of the best practices for dealing with LLMs or foundation models? | ||||||||||||
75 | Senior Data Scientist | 1,000+ | Yes | Chatbot, semantic search | Deploy open source LLM | Data privacy, big models | Fake information | Open source model (GPT-J, etc) | PyTorch, Hugging Face | Inference for LLMs | Generate multiple output | Not really | Chatgpt for domain knowledge | ||||||||||||
76 | ML Engineer | 1,000+ | Yes | Customer support | We’ve integrated with model store! | Memory requirements during inference | Slow start up times/it’s very different from regular code | In house model | Hugging Face | Inference for LLMs | We’re doing classification not generation | Curious to hear about other use cases! | |||||||||||||
77 | Director | 1-10 | No | No need yet | I don't need them | Yes, too heavy | Too heavy | Not really | |||||||||||||||||
78 | Technical Marketing Manager | 10-50 | Yes | Content marketing and text drafting | not at the moment in my department | costs under control and reliability | Open source model (GPT-J, etc) | Inference for LLMs | |||||||||||||||||
79 | Head of MLOps | 1,000+ | No | Currently assessing the risk and potential applications, performing POCs. | What are the potential risks, where are the benefits worth the costs for enterprise grade integration, how do we make these things secure, how do we monitor? | In the process of doing so. | We’re being thorough. | Anyone any good resources on analysing the risks and how to monitor LLM based solutions? | |||||||||||||||||
80 | CDO | 1,000+ | No | We are exploring the space but don't feel they are ready for production use yet | How to pin them to a domain, how to verify the veracity of the outcome and how to best chain them with APIs | We are exploring customer facing and colleague facing applications | Concerns about sticking to a domain and about veracity of the answers | What use cases do people have? What tooling do they recommend? How do they deal with the hallucinations? | |||||||||||||||||
81 | Owner | 1-10 | Yes | sales pipelines, text generation, data classification and many more | Yes, several | The need to use Python | Cost, accuracy | Open AI API | Zapier | Finetuning and embeddings | extensive testing | ||||||||||||||
82 | Agriculture consultant | 1,000+ | No | $ | $ | No | No | No | No | ||||||||||||||||
83 | Data scientist | 1-10 | Yes | Yes | Computing power | Open AI API | Fine-tuning LLMs | ||||||||||||||||||
84 | Product Owner Data Science | 50-500 | No | MLOps and DataScience rather young discipline. We are currently working on an LLM use case though | Runtime, Drift detection | No | |||||||||||||||||||
85 | Team Lead DS & MLE | 50-500 | No | Domain specific challenges for what we do. Digging into how we could train LLM to fit the needs. | What is it? There is just the huge gap in understanding about analyzing text vs generating text. | We were using s-BERT in our primary recsys model. Recent modeling efforts have reset to simpler systems that are performing better. Still using it in a 2nd model for a clustering challenge. | |||||||||||||||||||
86 | Co-founder | 1-10 | Yes | Semantic vector embeddings as a feature into RecSys | Not yet | Sparse representations (e.g. tfidf) are easier to work with for our use-cases, because they survive simple aggregation better. | Explainability of embeddings. | Open source model (GPT-J, etc) | Embeddings | We have come up with a large list of proxy metrics that we found correlate with performance on our tasks - so we track those. | As far as embeddings are concerned, we haven't found a clear advantage of LLMs vs simpler models (tfidf) trained on a relevant dataset. Both require elaborate preprocessing of the embedded documents for example. | I'd love to connect with others who use LLMs embeddings and talk in more detail :) | |||||||||||||
87 | Machine Learning Engineer | 50-500 | No | Using LLMs isn't a focus for my leaders right now | Two key questions I ask myself about LLMs in prod are: 1 - What is the best strategy to train or fine-tune an LLM? 2 - How to manage infrastructure costs? | not yet | . | . | What tools are you using to train or fine-tune these models, and when deploying a model as an API, what issues have come up? | ||||||||||||||||
88 | CEO | 1-10 | No | No current need, but also tooling | Reliability, safety, certainty, hallucinations, etc | No | |||||||||||||||||||
89 | Principal Cloud Developer | 50-500 | Yes | text resume, make ppts, text quizzes, text to flash cards, video resume etc,etc | - | - | bad output, costs etc,etc | Open AI API | - | Fine tunning LLMS | - | - | - | ||||||||||||
90 | CPO | 1-10 | Yes | We help organizations transition Excel reports to Python | Yes -- our product, Mito, is a spreadsheet that generates Python code as you edit it. We have integrated the chatgpt api into Mito so users can generate code using the LLM. Then, they can use the Mito spreadsheet to verify that the code it generated is correct. | We are not yet sure how large enterprises are going to adopt LLMs. Our large enterprise customers are currently not using the Mito AI feature because there is too much uncertainty. | Open AI API | Just the Mito spreadsheet | Figuring out how to deploy LLMs for enterprise use | We give users a spreadsheet interface to verify the output of the LLM. Spreadsheets are great for understanding changes to data! | |||||||||||||||
91 | Cloud MLOps Engineer | 10-50 | Yes | Translate French text to English | Yes, we have used the opus-mt model in a containerised environment for translation | It is pretty hard to work with these models, as they need a lot of resources to run, which also significantly increases the cost | Need of resources | Other model provider API | Docker, Kubernetes, AWS resources | Inferenece for LLMs | |||||||||||||||
92 | CEO / Co-Founder | 1-10 | Yes | Writing SQL Queries from Natural Language. | No | Getting it to write a query was relatively easy. Getting it to write a query that would actually be correct AND execute is much harder. | My concerns at this point are primarily focused around getting the LLM to generate an executable query in all cases. | Open AI API | Apache Drill, Python. | Fine tunning LLMS | |||||||||||||||
93 | Senior MLOps engineer | 1,000+ | Yes | Adverse media about suppliers. | No, but have plans to do so! I will be involved in the feature store architecture. | The unpredictable nature of responses. | That they give out results that are way off. | Other model provider API | Databricks, Azure | Embeddings | Not well :D | None yet | |||||||||||||
94 | machine learning engineer | 1,000+ | No | Legal said not to | How do we keep costs down? How do we validate the outputs? | nope | |||||||||||||||||||
95 | CEO | 1-10 | Yes | Development of production tools for deployment/maintenance of LLMs and other models | We have a platform to develop/run models in production called Statfish | Accuracy and hallucinations | Can't rely on answers 100% of the time | Open source model (GPT-J, etc) | Jupyter, kubernetes, haystack | Inferenece for LLMs | Working on that problem | ||||||||||||||
96 | ML engineer | 50-500 | No | currently using classic ML and I plan to introduce LLM capabilities to my org | cost of training and inference, time, gpu, .... | no | |||||||||||||||||||
97 | Domain Chapter Lead - Data Science and AI | 1,000+ | Yes | Summarisation of customer calls | nope | View of accuracy and quality of results | Reliability | Open AI API | Azure stack | Embeddings | no framework yet | not much from our side yet | How do we manage the quality of LLM outputs? | ||||||||||||
98 | Senior Machine Learning Engineer | 1,000+ | No | Lack of applicable business use cases and resources | How do you serve them? | Only for a developer | |||||||||||||||||||
99 | VP of ML | 50-500 | No | Expensive, no ROI, complex, dangerous | Is it reliable? How do we know it's reliable? | Yes | Expensive and the business use case was poorly defined. Just a shiny toy | Giggles. | Use case examples would be interesting, especially where things have gone wrong. | ||||||||||||||||
100 | Technical Lead | 50-500 | Yes | POC, evaluation for healthtech | n/a | supplement/support vs replace | user risk | In house model | hf | Fine tunning LLMS | cross fingers | not that i can share cheaply, ha | Growing LLM features iteratively & coherently vs Alphabet's "put LLM in all the things!" |