We’re excited to share that Aparna Sinha is Pear’s newest Partner! Aparna has made an outsized impact in her time with us as a Visiting Partner, and we couldn’t be more thrilled that she’s joining our team full time.
Aparna brings a strong thesis and a depth of experience in enterprise, developer, and AI to Pear and is excited to work with ambitious founders during this breakout moment in AI. “We’re in the midst of unprecedented technological advancement. AI is re-shaping our present and enabling the most significant breakthroughs of our lifetimes. The breadth and pace of this technological shift creates an opportunity for startups to disrupt the value chain and reshape how we interact with our world,” says Aparna. “Pear’s co-founder is a female former founder with a PhD in engineering, and our work to grow top female technical entrepreneurs resonates with me,” she continued.
Over the last six months, Aparna helped launch PearX for AI, a new track of the PearX program tailored towards AI builders. We recently welcomed six teams to the inaugural PearX for AI cohort and are excited for these founders to debut their companies with the world soon. Through the PearX for AI program, Aparna is partnering with founders to define their product, win early customers, and grow. She’s able to leverage her deep experience from Google and in enterprise software to give Pear’s portfolio companies an advantage in the market.
Aparna is also building out our AI advisor community, connecting Pear founders with industry experts from organizations like Stanford, Google, OpenAI, Hugging Face, McKinsey and more. At this moment, AI is touching every single facet of technology and she’s been working to build a top notch council of advisors to assist Pear founders on their entrepreneurial journeys.
We’re so excited to welcome Aparna. If you’re a founder looking to connect, please email her at aparna@pear.vc.
Last week, we welcomed Naomi Chetrit Band to the Pear team to lead our Dorm program as a Senior Associate on the investment team. At Pear, we have a long history partnering with students to build the next wave of category-defining companies. We met the founders of companies like Affinity, Viz.ai, Nova Credit, and WindBorne when they were still in school, and we love partnering with students to build businesses from the ground up. With Naomi joining our team, we’re excited to take our program to the next level.
Naomi is excited to concentrate on supporting student founders. “My lifelong excitement for learning and education finds its natural home on campuses and inside classrooms. Living in Israel, surrounded by a vibrant startup ecosystem, I developed a strong inclination towards working with founders and supporting early stage startups. Being at Pear now allows me to blend both of my passions, for which I am truly grateful,” says Naomi.
Naomi joins us from the Wharton School, where she just completed her MBA. She also worked as a Pear Fellow during her time at UPenn. Naomi is also an Israeli CPA and Attorney with a career spanning EY-Parthenon, EY, and S. Horowitz & Co.
Welcome to the team, Naomi! If you’re a student builder or want to learn more about our Dorm program, you can connect with Naomi on Twitter, LinkedIn, or at naomi@pear.vc.
Louis’ journey to finance and accounting began with his life-long card playing hobby. He loves working with numbers and rules, and describes himself as quietly a very competitive person. During high school and college, Louis was a professional-caliber Magic: the Gathering player, peaking as a top 300 player in the world while he was in graduate school, but his first love is cribbage, which he grew up playing with his family.
Professionally speaking, Louis is an expert at VC finance. Prior to Pear, Louis worked across finance, tax, and fund accounting at firms like Emergence, Ohana Real Estate Investors, and Foundation Capital. At Pear, Louis’ role involves managing the financial functions of the firm and working across our four funds. He closely monitors budgeting and operating finances, portfolio valuations, LP reporting, and manages the annual tax and audit process. He’s also focused on helping Pear build scalable finance processes that will meet Pear’s needs as we continue to grow.
“I am excited to be at Pear because we are working with founders at the very start of their journey. It’s so fun to be with them from day one, with a ton of excitement and opportunity in front of them. I am also excited to be working with such a humble, collaborative team. It’s fun to feel like we’re all pushing in the same direction, building something together,” Louis says.
Outside of work, Louis lives in the Bay Area with his wife, two young children, and two cats. Questions for Louis? You can reach him at louis@pear.vc or find him on LinkedIn.
We first met Federato’s Co-founders Will Ross and William Steenbergen in March 2020, when they were first year grad students at Stanford’s Graduate School of Business. They were winners of the 2020 Pear Competition and we also invited them to join PearX, our early-stage bootcamp for founders.
When we met Will and William, they only had a product concept and some initial customer validation. But even though they didn’t have solid proof yet, we decided to partner with them in building Federato for a few key reasons:
First, we saw a big market opportunity. New risks like climate change, cyber security, and social inflation were changing the landscape. In the insurance industry, risk is the opportunity, and it’s a really hard problem to solve. Insurers operating processes are unable to accommodate emerging risks, but Federato brought a solution to the table to help insurers take on risks of under utilized data assets. We knew at the time that climate change was already affecting the insurance industry in dramatic ways. The elevated frequency of damaging weather events drove more than $100B in uninsured losses between 2018-2019 alone, and this number has only continued to grow since then.
Second, even though they were early in their journey, the founders had a clear vision. We believed in their vision to bring AI into how insurance companies manage the risks associated with an ever-changing world, including the elevated frequency of damaging weather events caused by climate change. They concluded that the best way to achieve this was through a federated learning mechanism (hence the name Federato) that would allow insurance companies to benefit from their own data, as well as other entities’ and insurance companies’ data, safely. We also appreciated that their solution delivered a simple, convenient, and beautiful UX experience, where every interaction was optimized for the user.
Third, we believed in the team from the get go. In their early days, they described themselves as “two deeply passionate, data science/product people who came together to do something about climate change with machine learning.” Will conceived of the concept behind Federato when he was an Associate at Venrock and William built ML models for the insurance industry at his prior startup, Building Blocks. Together, they researched and deeply understood the space. They didn’t just bring us an idea on a slide deck, but instead they brought a thoroughly thought out plan with multiple in-depth customer interviews, a light proof of concept built on publicly available data, and a clear understanding of the end user and end buyer. We could see that this was a team with a clear analytical mind and a bias for action, which is a rare occurrence.
During PearX, the team coined the term RiskOps, which is about realizing that risk cannot be priced without taking distribution into account. This is a tricky concept to explain clearly, but we worked closely with them to articulate this vision at Demo Day. We also partnered closely with Will and William in developing the first version of their operational underwriting software that continuously monitors risk at every underwriting decision, rather than only a few times per year. Arash and I remember working together to create hand drawn mockups of their initial software over long Zoom meetings during the height of Covid lockdowns.
Shortly after presenting at Pear’s S20 virtual Demo Day, Federato closed a Seed round led by Caffeinated Capital. Between their Seed and Series A rounds, we worked with them on important company-building milestones, like refining the product, building a strong company culture, and navigating long sales cycles through acquiring their first few customers.
We also helped the Federato team prepare for a successful Series A raise through our Series A Bootcamp. They successfully raised their Series A from Emergence in 2022.
In less than a year following the Series A raise, Federato proved itself even more. They truly became an economically efficient marketplace that connects data to the value it can actually create in underwriting. In this year, Federato team tripled their customer base, doubled their spend with existing customers, and entered new segments.
Riding off of this strong momentum, they just closed their Series B round from Caffeinated, Emergence, and Pear, and we’re excited to continue working with Will, William, and the entire (growing!) Federato team on their mission to modernize the insurance industry!
We recently launched a dedicated AI track to the PearX program and have received a great response. Founders often ask us for guidance on how to build a moat for their AI startup. There are many aspects to this question but to kick things off, we are sharing a presentation I gave at SF Tech Week that covers background on the emergence of Generative AI, the highest priority areas of application particularly in enterprises, and what we believe enables a ‘moat’ for AI startups.
Generative AI Tech Stack Presentation at SF Tech Week
Generative AI is a game changing technology for humanity. A quote from one of my heroes, Professor Fei Fei Li at Stanford, and also was head of AI at Google Cloud for a while captures the excitement well:
“Endowing machines with generative capabilities, has been a dream for many generations of AI scientists”
Seminal technologies which have led to the recent Generative AI breakthroughs include Cloud computing and within that advancements in GPUs, Kubernetes and open source frameworks like PyTorch provide an efficient and widely accessible substrate for model training and inference.
Research breakthroughs on the transformer neural network, its use on internet scale datasets and recent advancements in AI alignment are at the heart of most of the Generative AI capabilities today.
By no means are we at a peak yet, as research continues to improve efficiency at the hardware, software and services layers. Most interestingly to increase context lengths and optimize AI application architectures for accuracy, latency, and reliability. We cover some of these topics in depth in our Perspectives on AI fireside series.
It is clear that Generative AI techniques apply to multiple modalities. There has been a steady stream of models, both open source and proprietary in the major areas of NLP, Image, Video, Voice and also physical synthesis of Proteins.
Applications to both consumer and enterprise software abound and are already starting to change the shape of what software can do. We highlight some of the opportunities to build vertical and horizontal applications as well as tooling and infrastructure.
Of course there is hype when it comes to Generative AI, and in some sense it is almost too easy to create new functionality by building a thin layer over a foundation model that somebody else has built. While there are some businesses to be built in that way, for a venture scale business, we posit that a deeper moat is required to build. A large business that benefits rather than crumbles from rapid evolution of technology at the lower layers of the AI stack requires several moats.
Our thesis is that Applications will be composed of ensembles of specialized models, not just foundation models, but specialized models that are customized via fine tuning or in-context learning or a range of other techniques to complete part of a use case or workflow. These specialized models should utilize proprietary data specific to a domain and help to personalize the output of the application as well as ensure accuracy. A by product may also be lower cost to serve. Overall such an architecture will be a way to build lasting value and be more immune to disruption.
Tooling and infrastructure supporting the development of new applications of this kind is second part of our investment thesis. In particular, data and tooling companies to evaluate and ensure safety, accuracy, and privacy of these applications will be in demand. Lastly a few new infrastructure companies and capabilities will advance the development of these applications. We see emerging companies at every layer of the AI stack (slide 9). With that thesis in mind, building a moat is fundamentally not that different in AI than in any other emerging space (slide 10).
Enterprise readiness for adoption of AI is arguably higher than it has ever been with the widespread acceptance of cloud computing, API integrations, and existing investments in data analytics teams and software. The hurdles to enterprise adoption are also not new, these are the same requirements that any cloud service has to meet, with perhaps a stronger need for ease of use and simplicity given the lack of existing AI/ML talent.
We conclude by quoting what many others have already said, that this is a great time to start a company!
Last week, PearX S19 alum WindBorne Systems announced their $6M seed round led by Footwork Ventures and joined by Pear, Khosla Ventures, and others. We love WindBorne’s founding story as it embodies everything we believe at Pear about mission-driven, high perseverance founders.
I was lucky to meet the founders of WindBorne back in the spring of 2019, when they were undergraduate students at Stanford University. They showed up to my office hours with a balloon in tow, similar to the one pictured below.
I had no idea that the homemade-looking device was a low cost, highly durable weather balloon that could fly at a wide range of altitudes. They called this balloon ValBal for (Vent to sink, Ballast to rise).
We decided to partner with WindBorne for two reasons:
First, we believed in the founding team. They had demonstrated a clear mission, a strong passion, and an incredible ability to execute. They’d been addicted to this idea since they were freshman at Stanford as members of the Stanford Student Space Initiative. Constrained by a student budget, they used ingenuity and engineering to build weather balloons that would fly for a few hours and then pop. Over the years, they kept adding sensors and extending flight time, steadily improving their product. By the time I met them, they had already completed 26 test flights and broken four world records.
Second, they were capturing data that no one else was capturing. At the time, we were not 100% sure of the value of such data, but after making a few calls, we discovered potential customers were interested in learning more about what the balloons could do.
With that, we invited WindBorne to join PearX’s Summer 2019 cohort, where we partnered closely with the founders to understand the most attractive market opportunity to go after. Through customer interviews, they discovered a massive data gap: only about 15% of the Earth’s surface has regular in-atmosphere observations, but weather is a global system. To better predict the weather, you need better weather data. Existing technologies couldn’t plug that gap, because the laws of physics prevent satellites from collecting the most critical weather observations. That’s where balloons come in: they’re able to collect data where satellites can’t.
Investors were convinced of the potential, and following S19 Demo Day, Windborne closed a successful seed round led by Khosla and Ubiquity Ventures. They spent the last four years growing the company: from iterating on their go-to-market strategy to team building to customer interviews to strategy sessions and more. We had the privilege to work side-by-side with John, Kai, Joan, Andre, and the entire WindBorne team every step of the way.
Last fall, the WindBorne team went out to begin raising their next round, in the midst of a tough economic climate. As PearX alums, we invited them to participate in our PearX S22 Demo Day, where Kai shared the latest happenings with thousands of investors in the Pear network. We’re so proud of the entire team for this next step in their journey.
I recently visited the new WindBorne HQ in Palo Alto, just a few weeks after the team moved in, and I left with a smile on my face. The offices were scrappy, the team was as determined as ever, and they were telling me all about the new advances with their product. I couldn’t help but look back and remember the first time they ever walked into my office as Stanford students and feel proud of how far they’ve come. I know this is just the beginning for WindBorne. We are so excited to continue to partner with them and to welcome Footwork and others to the team.
San Francisco and Los Angeles Tech Weeks are now complete, and we were excited to be a part of the celebration again this year! We hosted four events this year. Thank you to the founders and entrepreneurs who joined us at our events. Here’s a quick recap of our Tech Week events:
Hiring your first engineer
To kick us off, Pear’s Talent Partner Matt led a presentation and discussion on hiring your first engineer. Matt shared the strategy and process behind what hiring looks like for a first time founder – sharing advice and insight on timing, what to look for, and how to attract, find, assess and close your first hire.
Enterprise opportunities in generative AI
We pushed the capacity limits in our quaint SF office with nearly 100 guests for our Generative AI talk, led by Aparna. She presented on the latest trends, technologies, and applications in generative AI, specifically as it pertains to early founders.
The Gusto founding story
For our final SF Tech Week event, Pear’s Founding Managing Partner, Pejman, sat down with Gusto’s CEO and Co-founder, Josh Reeves. They discussed the journey building Gusto from the ground up into a platform for 300,000+ businesses that is worth $10B today. Josh’s biggest piece of advice is to always focus on the customer first and foremost.
What Apple Reality Pro means for XR
Last week we moved south for LA Tech Week, where Pear’s Partner Keith held a discussion in sunny Santa Monica exploring Apple Reality Pro and the ripple effects it will have on XR. It was great to meet so many XR founders, builders, and enthusiasts of the space!
Thanks to all who joined us for SF and LA Tech Weeks. We are so grateful to our community near and far!
Looking to connect with someone from the Pear team? Head over to our team page and feel free to reach out, we’d love to hear from you. If you’d like to join us at a future event, keep an eye on our events page.
Last month, I had the immense privilege of helping judge Pear VC’s Stanford student Competition. The event highlighted the brightest student founders from Stanford vying for their first check. The Pear Competition has a history of identifying and nurturing exceptional talent, supporting unicorn companies such as Viz.ai and other breakout companies including Nova Credit, Federato, Conduit Tech, and Wagr.
Pre-seed is a notoriously hard stage to invest, as startups often lack any metrics, product, traction, or proven revenue model. The excitement and challenge lies in the ability to identify hidden gems despite the uncertainty, requiring a skillset that combines intuition, experience, and a deep understanding of market landscapes across a wide range of industries.
With the invaluable experience of judging and doing diligence on close to 100 founders alongside renowned investors and proven operators, Mar Hershenson and Ilian Georgiev, I wanted to share 5 key takeaways in pre-seed investing from one of the best early-stage VCs:
1. Passion and Market Insight
Impressive founders had a deep understanding of their market, derived from a unique blend of professional experience, customer interviews, and thorough research. They could clearly pinpoint “hair on fire” problems and delve into pain points along the customer journey in excruciating detail, ultimately laying the foundation for a compelling vision for the problem they aim to solve.
These founders masterfully answered highly nuanced follow-up questions, while still demonstrating a humbling awareness of what they still needed to learn.
2. High Learning Rate
Another exciting key trait of founders was a demonstrated “high rate of learning”. These founders were unafraid to openly discuss assumptions and hypotheses that were proven wrong, providing insights into how their understanding of the market and potential solutions continually evolved. This grounded reflection illustrated their willingness to pivot when necessary, ensuring they could navigate the inevitable uncertainties of the startup journey.
3. Execution Velocity
Several founders stood out with their relentless drive to move fast. They leveraged no-code tools, pounded the pavement to connect with customers, and used smokescreen tests to gauge demand. These tenacious entrepreneurs consistently found ingenious, low-cost, and scrappy ways to rapidly test hypotheses, never allowing a single obstacle to halt their progress. They made do with what they had, not waiting on “ideal” resources or the “perfect team”.
4. Commitment
High commitment and perseverance was another trait we looked for in founders. Despite the glamour of eye-catching TechCrunch headlines, the reality is that the founder journey is an uphill marathon. Most startups must navigate the treacherous “pit of despair” for an average of 18 months before achieving product-market fit. A demonstrated ability to weather these upcoming challenging times after the initial excitement fades is a vital asset to tackle the inevitable hurdles of entrepreneurship.
Some of these founders had a history of starting previous businesses, often grappling with numerous setbacks and pivots. They could detail stories of struggles and challenges they faced in their founding journey, demonstrating a balance of grit and determination to continually refine their craft.
5. True Meaning of a “No”
Perhaps the most insightful lesson was that a “no” from a VC often does not mean that the founder or the business wasn’t exceptional. Many factors can contribute to a “no” despite an impressive company, such as a competing investment, the market size, or a mismatch with the VC’s sector focus. Founders often forget that building an outstanding business and securing funding from a specific VC are distinct pursuits. Never let a single “no” derail your founding journey. Embrace the challenge, learn from the feedback, and keep building. Success is not solely defined by the checks you secure but by the impact you create through the relentless pursuit of your vision.
In essence, the art of pre-seed investing lies in recognizing founders who possess a unique combination of passion, adaptability, and resilience. These entrepreneurs are driven by their vision and demonstrate an uncanny ability to navigate uncertainty, making them invaluable assets in the early-stage startup ecosystem, and proving that success is ultimately measured by the tenacity to transform a compelling idea into a lasting impact.
Guest post written by Alex Wu, a Pear Fellow at Stanford.
We’re excited to announce that Arpan Shah will be Pear’s newest Partner. A Visiting Partner for the last year, as well as former Robinhood founding engineer and founder of Flannel (acquired by Plaid), we couldn’t be more thrilled to have him permanently onboard.
An alumni of Pear Garage, Arpan has always embodied the people-first Pear ethos and now follows the operator-turned-investor journey. He will continue working on investments in his wheelhouse of Fintech, developer tools as well as data platforms and AI.
“I’m excited to find companies that have more innovative approaches that are both scalable and cost efficient in this world where more and more data will be used in more and more interesting ways.”
As a Visiting Partner, Arpan has supported portfolio companies at the intersection of Fintech, AI and Data. He’ll continue providing his expertise with PearX for AI (the first cohort of which is still open for applications).
“I really like working with founders who are trying to build companies that seem ridiculously hard. Those are the types of founders that I think are quite exciting, because they’re really motivated to not pursue small wins, but really make transformational change happen in an industry.”
We recently hosted a fireside chat on Generative AI with Nazneen Rajani, Robustness Researcher at HuggingFace. During the discussion, we touched on the topic of Open source AI models, the evolution of Foundation Models, and frameworks for model evaluation.
The event was attended by over 200 founders and entrepreneurs in our PearVC office in Menlo Park. For those who couldn’t attend in person, we are excited to recap the high points today (answers are edited and summarized for length). A short highlight reel can be also found here, thanks to Cleancut.ai who attended the talk.
Aparna: Nazneen, what motivated you to work in AI Robustness, could you share a bit about your background?
Nazneen: My research journey revolves around large language models, which I’ve been deeply involved in since my PhD. During my PhD, I was funded by the DARPA Explainable AI (XAI) grant, focusing on understanding and interpreting model outputs. At that time, I worked with RNNs and LSTMs, particularly in tasks involving Vision and language, as computer vision was experiencing significant advancements. Just as I was graduating, the transformer model emerged and took off in the field of NLP.
Following my PhD, I continued my work at Salesforce Research, collaborating with Richard Socher on interpreting deep learning models using GPT-2. It was fascinating to explore why models made certain predictions and generate explanations for their outputs. Recently, OpenAI published a paper using GPT4 to interpret neurons in GPT-2, which came full circle for me..
Currently, my focus is on language models, specifically on hallucination factuality, interpretability, robustness, and the ethical considerations surrounding these powerful technologies. I am currently part the H4 team at Hugging Face, working on building an open-source alternative to GPT, providing similar power and capabilities. Our goal is to share knowledge, artifacts, and datasets to bridge the gap between GPT-3-level models and InstructGPT or GPT-4, fostering open collaboration and accessibility.
Aparna: That’s truly impressive, Nazneen. Your background aligns perfectly with the work you’re doing at Hugging Face. Now, let’s dive deeper into understanding what Hugging Face is and what it offers.
Nazneen: Hugging Face can be thought of as the “GitHub of machine learning.” It supports the entire pipeline of machine learning, making it incredibly accessible and empowering for users. We provide a wide range of resources, starting with datasets. We have over 34,000 datasets available for machine learning purposes. Additionally, we offer trained models, which have seen tremendous growth. We now have close to 200,000 models, a significant increase from just a few months ago.
In addition to datasets and models, we have a library called “evaluate” that allows users to assess model performance using more than 70 metrics. We also support deployment through interactive interfaces like Streamlit and Gradio, as well as Docker containers for creating containerized applications. Hugging Face’s mission is to democratize machine learning, enabling everyone to build their own models and deploy them. It’s a comprehensive ecosystem that supports the entire machine learning pipeline.
Aparna: Hugging Face has become such a vital platform for machine learning practitioners. But what would you say are the benefits of open-source language models compared to closed-source models like GPT-4.
Nazneen: Open-source language models offer several significant advantages. Firstly, accessibility is a key benefit. Open-source models make these powerful technologies widely accessible to users, enabling them to leverage their capabilities. The rate of progress in the field is accelerated by open-source alternatives. For example, when pivotal moments like the release of datasets such as RedPajama or LAION or the LLAMA weights occur, they contribute to rapid advancements in open-source models.
Collaboration is another crucial aspect. Open-source communities can come together, share resources, and collectively build strong alternatives to closed models. The compute is no longer a bottleneck for open source.. The reduced gap between closed and open-source models demonstrates how collaboration fosters progress. Ethical considerations also come into play. Open-source models often have open datasets and allow for auditing, risk analysis.
Open-source models make these powerful technologies widely accessible to users, enabling them to leverage their capabilities.
Aparna: Nazneen, your chart showing the various models released over time has been highly informative. It’s clear that the academic community and companies have responded strongly to proprietary models. Could you explain what Red Pajama is for those who might not be familiar with it?
Nazneen: Red Pajama is an open-source dataset that serves as the foundation for training models. It contains an enormous amount of data, approximately 1.5 trillion tokens. This means that all the data used to train the foundation model, such as the Meta’s LLaMA, is now available to anyone who wants to train their own models, provided they have the necessary computing resources. This dataset has made the entire foundation model easily accessible. You can simply download it and start training your own models.
Aparna: It seems that the open source community’s reaction to closed models, has led to the development of alternatives like Red Pajama. For instance, Facebook’s Llama had a restrictive license that prevented its commercial use, which triggered the creation of Red Pajama.
Nazneen: Absolutely, Aparna. Currently, powerful technologies like these are concentrated in the hands of a few, who can control access and even decide to discontinue API support. This can be detrimental to applications and businesses relying on these models. Therefore, it is essential to make such powerful models more accessible, enabling more people to work with them and develop them. Licensing plays a significant role in this context, as it determines the openness and usability of models. At Hugging Face, we prioritize open sourcing and face limitations when it comes to closed models. We cannot train on their outputs or use them for commercial purposes due to their restrictive licenses. This creates challenges and a need for accessible alternatives.
It is essential to make such powerful models more accessible, enabling more people to work with them and develop them.
Aparna: Founders often start with GPT-4 due to its capabilities and ease of use. However, they face concerns about potential changes and the implications for the prompt engineering they’ve done. The uncertainty surrounding access to the model and its impact on building a company is a significant worry. Enterprises also express concerns about proprietary models, as they may face difficulties integrating them into their closed environments and ensuring safety and explainability. Are these lasting concerns?
Nazneen: The concerns raised by founders and enterprises highlight the importance of finding the right model and ensuring it fits their specific needs. This is where Hugging Face comes in. Today, we are releasing something called “transformer agents” that address this very challenge. An agent is a language model that you can chat with using natural language prompts to define your goals. We also provide a set of tools that serve as functions, which are essentially models from our extensive collection of 200,000 models. These tools are selected for you based on the task you describe. The language model then generates the necessary code and uses the tools to accomplish your goal. It’s a streamlined process that allows for customization and achieving specific objectives.
Aparna: I learned from my experience with Kubernetes that open source software is great for innovation. However, it can lack reliability and ease of use unless there’s a commercial entity involved. Some contributions may be buggy or poorly maintained, and the documentation may not always be updated or helpful. To address this, Google Cloud hosted Kubernetes to make it more accessible. How does Hugging Face help me navigate through 200,000 models and choose the right one for my needs?
Nazneen: The Transformers Agents can assist you with that exact task. Transformer agents are essentially language models that you can chat with. You provide a natural language prompt describing what you want to achieve, and the agent uses a set of pre-existing tools, which are essentially different models, to fulfill your request. These tools can be customized or extended to suit specific goals. The agent composes these tools and runs the code for you, making it a streamlined process. For example, you can ask the agent to draw a picture of a river, lakes, and trees, then transform that image into a frozen lake surrounded by a forest. These tools are highly customizable, allowing you to achieve your desired outcomes.
Aparna: It feels like the evolution of what we’ve seen with OpenAI’s GPT plug-ins and Langchain’s work on chaining models together. Does Hugging Face’s platform automate and simplify the process of building an end-to-end AI application?
Nazneen: Absolutely! The open-source nature of the ecosystem enables customization and validation. You can choose to keep it in a closed world setting if you have concerns about safety and execution of potentially unsafe code. Hugging Face provides tools to validate the generated code and ensure its safety. The pipeline created by Hugging Face connects the necessary models seamlessly, making it a powerful and efficient solution.
Aparna: This aligns with our investment thesis and the idea of building applications with models tailored to specific workflows. Switching gears, what are some of the applications where you would use GPT-3 and GPT4?
Nazneen: GPT-3 can be used for almost any task. The key approaches are in-context learning and pre-training. These approaches are particularly effective for tasks like entity linking or extraction, making the model more conversational.
While GPT-3 performs well on traditional NLP tasks like sentiment analysis, conversational models like GPT-4 shine in their ability to engage in interactive conversations and follow instructions. They can perform tasks and format data in specific ways, which sets them apart and makes them exciting.
The real breakthrough moment for generative AI was not GPT-3. Earlier chatbots like Blenderbot from Meta and Microsoft’s chatbots existed but were not as popular due to less refined alignment methodologies. The refinement in approaches like in-context learning and fine-tuning has led to wider adoption and breakthroughs in generative AI.
Aparna: How can these techniques address issues such as model alignment, incorrect content, and privacy concerns?
Nazneen: Techniques like RLHF focus on hallucination and factuality, allowing models to generate “I don’t know” when unsure instead of producing potentially incorrect information. Collecting preferences from domain experts and conducting human evaluations can improve model performance in specific domains. However, ethical concerns regarding privacy and security remain unresolved.
Aparna: I do want to ask you about evaluation. How do I know that the model that I find tuned is actually good? How can I really evaluate my work?
Nazneen: Evaluation is key for a language model because of the stochasticity of the thing. Before I talk about evaluation, I want to first talk about the types of learning or training that goes into these language models. There are four types of learning.
The first is pre training, which is essentially building the foundation model.
The second is in-context learning or in-context training, where no parameters are updated, but you give the model a prompt, and describe a new task that you want the model to achieve. It can either be zero shot, or a few shots. And then you give it a new example. It learns in context.
The third one is supervised fine tuning, which is going from something like GPT3 to instruct GPT. So, you want this foundation model to follow instructions and chat with a human and generate outputs that are actually answers to what the person is looking for or being chatty and being open ended and also following instructions.
The fourth type of training is called reinforcement learning with human feedback. In this case, you first train a reward model based on human preferences. What people have done in the past is, have humans generate a handful of examples, and then ask something like chat GPT to generate more. That is how Alpaca came about and the self instruct data set came about.
For evaluating the first two types of learning, pre-training and in-context learning, we can use benchmarks like the big bench from Google, or the helm benchmarks from Stanford, which are very standard benchmarks of NLP tasks.
During supervised fine tuning, you evaluate for chattiness, whether the language model is actually generating open-ended answers, and also whether it’s actually able to follow instructions. We cannot use these NLP benchmarks here.
We also have to evaluate the reward model to make sure that it has actually learned the values we care about. The things that people generally train the reward model on are truthfulness, harmlessness, and helpfulness. So how good is the response in these dimensions?
Finally, the last part is the very interesting final way to evaluate is called Red Teaming, which comes in the very end. In this case, you’re trying to adversarially attack or prompt the model, and see how it does.
Aparna: What do you think are the defensible sources of differentiation in generative AI?
Nazneen: While generative AI is a crowded field, execution and data quality are key differentiators. Ensuring high-quality data and disentangling hype from reality are crucial. Building models with good data quality can yield better results compared to models trained on noisy data. Investing effort and resources into data quality is essential.
While generative AI is a crowded field, execution and data quality are key differentiators.
Aparna: Lastly, what do you see as the major opportunities for AI in enterprise?
Nazneen: Enterprise AI solutions have major opportunities in leveraging proprietary data for training models. One example is streamlining employee onboarding by automating email exchanges, calendars, and document reading. Workflows in platforms like ServiceNow and Salesforce can also be automated using large language models. The enterprise space offers untapped potential for AI applications by utilizing data and automating various processes.