Haystack: Open-Source AI Framework for Production Ready Agents, RAG

(haystack.deepset.ai)

66 points | by doener 6 hours ago

7 comments

  • bane 2 hours ago
    People who name projects, please think very carefully if you want to use "Haystack" as the name of anything. There are literally thousands upon thousands of both overlapping and completely different products, projects, bands, initiatives, efforts, and so on with that name and all the possible variants you can think of (Haystax, Heystack, Heystax, Hay Stak, and so on).

    And no, you probably won't be the first project with that name in whatever market/vertical/milieu that you are working in.

    I found half a dozen different "Haystack" products and companies working in AI in 10 seconds of googling.

    Please make it stop.

    • skinfaxi 2 hours ago
      The irony is strong with this one. Should have called it "Needle".
    • vivzkestrel 1 hour ago
      - this is what happens when you don't know how SEO works and how to brand

      - could have used a unique name like agistack or llmax or supagpt or something like this

      - that way you get to redirect all traffic to your website 100%

  • bitlad 2 hours ago
    > Haystack collects anonymous usage statistics of pipeline components. We receive an event every time these components are initialized. This way, we know which components are most relevant to our community.

    For an EU-based company, this stands out.
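
    For readers who want to turn this off: as far as I know, Haystack's telemetry docs describe an opt-out through the `HAYSTACK_TELEMETRY_ENABLED` environment variable. A minimal sketch, assuming that variable name is still current for your version and that it has to be set before the library is imported (both are assumptions, so check the docs):

```python
import os

# Assumption: Haystack reads this variable at startup, so it is set here
# before any `import haystack`. Verify the name against your version's docs.
os.environ["HAYSTACK_TELEMETRY_ENABLED"] = "False"

# import haystack  # left commented out so the sketch runs without Haystack installed
```

    Setting the same variable in the shell or the container environment should be equivalent and avoids depending on import order.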

  • throwaw12 5 hours ago
    It's good that there is competition in the framework space, but does anyone have a holistic view of, or opinions about, their differences and where they shine?

    For example,

    * there is LangChain and LangGraph - used a lot, but framework bloat is hated as well

    * mastra - for typescript projects

    * pydantic, agno, strands, openai agents sdk, claude agents sdk, and so on and on and on

    • everforward 2 hours ago
      > * there is LangChain and LangGraph - used a lot, but framework bloat is hated as well

      I've used them a fair bit, and I'm not a huge fan. Only self-hosted, I can't comment on their cloud-SaaS agent runner thing. The observability looks neat, though.

      LangChain is nice enough; I appreciate having a unified API across providers. LangGraph is... just not all that much? As in, the DAG is too much for a simple agent, but when I start thinking about a large agent and dealing with that flow in their DAG DSL, my head starts to hurt. "Go To Definition" isn't going to help navigate that very well, and the state is going to be a lot of Optionals with not a lot of info on when they have a real value at which stages in the DAG.

      I had substantial issues with branches in my DAG because the state has to include all possible fields for every step. It gets hard to mentally track all the combinations of fields that will be present or missing depending on the path taken upstream of this node. Do I have RAG results? Not sure, it depends on whether the query includes X, but then later it also depends on whether a tool returns a particular result, in which case it can either be missing, have a single value or have 2 values. Yada yada, in a sufficiently large DAG it gets hard to track those. Things are much cleaner in function world where you can declare "this function requires X and Y, and can optionally provide Z".
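
      The contrast described here can be sketched roughly as follows. This is an illustration only: `GraphState`, `answer_node`, `retrieve`, and `answer` are hypothetical names, and none of it is LangGraph's actual API.

```python
from dataclasses import dataclass
from typing import Optional


# DAG style: one shared state object, so every field any branch might
# produce has to be Optional, and each node must guess what is present.
@dataclass
class GraphState:
    query: str
    rag_results: Optional[list[str]] = None
    tool_output: Optional[str] = None
    answer: Optional[str] = None


def answer_node(state: GraphState) -> GraphState:
    # Nothing in the type says whether retrieval ran upstream of this node.
    context = state.rag_results or []
    state.answer = f"{state.query} ({len(context)} docs)"
    return state


# Function style: each step declares exactly what it needs and returns.
def retrieve(query: str) -> list[str]:
    # Stand-in retrieval: pretend every query matches two documents.
    return [f"doc about {query}", f"another doc about {query}"]


def answer(query: str, docs: list[str]) -> str:
    # `docs` is required, so a caller cannot forget to run retrieval.
    return f"{query} ({len(docs)} docs)"
```

      In the first form a type checker accepts `answer_node` on a state where retrieval never ran; in the second, calling `answer` without `docs` is an error before anything executes.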

      I mostly go directly to the API these days, but I'm fairly settled on Ollama. I might use LangChain if I think I'll want another backend, but I also might use OpenRouter. I haven't yet, but it seems cool.

    • sgc 3 hours ago
      I am very curious about this as well. I'm looking for something that does really well with workflows that require 20-plus steps, including a couple of while loops and user verifications, but also something simple like a chat bot with access to MCP servers and tools. I need to be able to use Gemini directly, OpenAI directly, and OpenRouter at various steps.

      Right now I am having trouble deciding whether it's better to just write my own harness, or use LangChain, or something else.
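
      For a sense of scale, a hand-written harness covering those requirements (per-step provider choice, a while loop, a user verification gate) might start out something like this. It is a sketch under assumptions: `Provider`, `EchoProvider`, and `run_workflow` are hypothetical names, and the echo class stands in for real Gemini, OpenAI, or OpenRouter clients.

```python
from typing import Callable, Protocol


class Provider(Protocol):
    """Anything that turns a prompt into text (Gemini, OpenAI, OpenRouter, ...)."""

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Offline stand-in so the sketch runs without network access or API keys."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


def run_workflow(
    task: str,
    providers: dict[str, Provider],
    approve: Callable[[str], bool],
    max_attempts: int = 3,
) -> str:
    # Step 1: draft with one provider.
    draft = providers["drafter"].complete(task)

    # Step 2: revise in a while loop until the user approves or attempts run out.
    attempts = 0
    while not approve(draft) and attempts < max_attempts:
        attempts += 1
        draft = providers["reviser"].complete(f"revise #{attempts}: {task}")

    # Step 3: hand the approved (or last) draft to a third provider.
    return providers["finalizer"].complete(draft)
```

      The `approve` callback is where a real user prompt would go. Whether this stays manageable at 20-plus steps is exactly the open question; the sketch only shows that the basic shape is plain functions and a loop.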

      • reactordev 3 hours ago
        You can do all of that with current frameworks depending on how deep you go. The issue people are finding is that one size does not fit all and your specific workflows may be better suited for a lower level.
        • sgc 2 hours ago
          What I find impossible to judge is whether me choosing the harness that works best for me and the way I like to work will limit the quality of the LLM output.

          In this case, given the complexity of LangChain I don't know if it would burn a lot of tokens or confuse the LLMs with context creep due to the large env and tooling compared to something much simpler, or would something much simpler burn a lot of tokens and stutter in execution compared to LangChain, because they have a lot of middleware optimizations that I would have to relearn the hard way? Or are both those strategies off the mark and there is a better tool for the job I am not even thinking about?

          It gets a bit expensive in terms of both time/effort and money to experiment on a full workflow rather than on specific steps, which is quite easy in OpenRouter, hence my curiosity as to others' experiences.

    • nijave 4 hours ago
      > Strands

      I like it quite a bit. Imo it's sort of like the "Flask" of frameworks. It's pretty easy to get started with and has a pretty pluggable ecosystem where you can choose models, providers, tools without much lock-in.

      It has AWS weight behind it (for better or worse) and has a slight skew towards AWS solutions but it's still trivial to skip those.

      Krew.ai is another one not on your list. It's fairly comprehensive but it was too "enterprisey" and too heavy on proprietary jargon for all the components imo. I think the tech is ok but the docs and terminology are very dense.

      I also briefly looked at LangGraph and it's fairly straightforward and provides a lot of the same abstractions and interfaces as Strands but it felt too easy to accidentally miss them and end up reinventing the wheel. Put another way, it was easy to accidentally miss the abstraction and wade into DIY primitive territory. That may be a feature to some, but I thought it would be a distraction for quick adoption at a small company.

      When I checked a few months ago, LangGraph, Krew, and OpenAI Agents SDK had better out-of-the-box integration with common observability/monitoring/tracing solutions. Strands has pretty robust OTel support but not all providers had implemented the OTel AI spec at the time, so it required some adapters/wrapper code (plenty of community examples online, just slightly less plug and play).

      At the time, I don't think Claude Agents SDK existed yet or it was very new. I skipped OpenAI SDK because we wanted a multi-vendor solution. Pydantic Agents was too new so I skipped that.

  • isawczuk 3 hours ago
    I have experience using it with clients on several small to large projects. It has advantages and disadvantages, like every framework.

    Clients choose it because it's an EU-based company.

    • zzleeper 2 hours ago
      Would you have used something else without that constraint?
  • alansaber 1 hour ago
    I remember haystack being completely unusable for extractive QA 2 years ago. I wonder if it's the same package.
    • bpiche 52 minutes ago
      Likewise
  • piratebroadcast 4 hours ago
    RubyLLM is where it's at.
    • tnzk 4 hours ago
      Why Ruby? It seems to be a Python framework.
  • tw1984 5 hours ago
    deepset? more like deadseek.

    no, thanks, never going to use that.

    • srameshc 5 hours ago
      > deepset? more like deadseek.

      Just curious, why say that?