Channel: BitFunnel
Browsing all 15 articles

Index Build Tools

NOTE: This page was updated on 9/19/16 to reflect significant changes in the index build tools. After many months of hard work, we kind of, sort of have a document ingestion pipeline that seems to...


Getting started with NativeJIT

NativeJIT is a just-in-time compiler that handles expressions involving C data structures. It was originally developed in Bing, with the goal of being able to compile search query matching and search...


Stream Configuration

BitFunnel models each document as a set of streams, each of which consists of a sequence of terms corresponding to the words and phrases that make up the document. Real world documents are usually...


A Small Query Language

A challenge in bringing BitFunnel to open source is providing functionality that was previously supplied by portions of Bing upstream of BitFunnel. BitFunnel was designed as a library that takes, as...


BitFunnel performance estimation

Hi! I’m going to talk about two things today. First, I’m going to talk about one way to think about performance. That is, one way you can reason about performance. Second,...



Sample Data

I’ve been trying to make it really easy to get started with BitFunnel, but we still have a ways to go. From the beginning we put a lot of effort into ensuring our code would build and run on Linux,...


Searching for Primes

What do prime numbers have to do with BitFunnel? It turns out we use them to test our matching engine. One of the challenges in bringing up a new search engine is figuring out how to test it. If you...


All's Well That Ends Well

We’ve been having some stability problems of late. In our rush to get some minimal version of the document ingestion pipeline up and running, we created a number of tools for gathering corpus...


When will BitFunnel be usable?

How long should we expect this project to take? In theory, we should have a relatively easy time guessing how long this project will take because this project is a half-port-half-rewrite whose aim to...



Debugging an SEH Crash

Here’s a video showing how I debugged a read access violation that was caused by an earlier buffer overflow. This sort of problem can sometimes be hard to track down, but in this case, a data...


How do we make onboarding to BitFunnel easier?

I’ve been working on BitFunnel for roughly six months now. If I look at how I’ve used that time, my guess is that I’ve taken about a month of Mike’s time. If you look at the progress we’ve made, I...


BitFunnel Glossary

To get a high level overview of the algorithm, please see this talk transcript. This glossary is incomplete and needs a lot of work! While our plan is to fill out the whole thing, that will probably...


Wikipedia as test corpus for BitFunnel

Wikipedia is a great test corpus for search engines. It is free and easy to obtain, it carries a license appropriate for research, and at ~59GB uncompressed, it is large, but not too large to fit on a...


Row Table Analysis

I spent the weekend implementing code to analyze bit densities in the rows and columns of the row tables. This tool should help us determine whether the row tables are configured correctly. A good row...


Debugging Bit Densities

Things are starting to get exciting in the Land of BitFunnel. We’re now at the point where we can ingest a significant fraction of Wikipedia and run millions of queries, all without crashing – and we...
