I'm looking for an algorithm for intersection of two small, unsorted array in very specific condition.

- Type of array item is just integer or integer-like type.
- Significant amount of time (about 30~40%?), one or both array might be empty.
- Arrays are usually very small - usually 1~3 items, I don't expect more than 10.
- The intersection function will be called very frequently.
- I don't care about platform dependent solution - I'm working on x86/windows/C++

Both brute-force/sort-and-intersect solutions are not that bad, but I don't think that they're fast enough. Is there more optimal solution?

In order to determine the intersection of both sets you have to inspect all elements at least once, so that means the most optimal class of solutions yield O(n + m) where n is the number of elements in one set and m the number of elements in the other.

You can achieve that by using a hash table. Given that your items are of type integer, you can count on finding a fast hash function. A simple algorithm would be:

- Iterate first set and add all elements to hash table
- Iterate second set and for each element, check if it exists in the hash table, if so, add it to the intersection set or just print it.

This would be O(n + m) assuming your hashing and your hash lookup are O(1).

Given that you know the sets are frequently empty, you can optimize this by first checking to see if one of the sets is empty, if so, just return an empty set. That's of course assuming that you know the count upfront and can calculate it without iterating the set. If that happens to be the case, you can optimize further by always first reading and hashing the smaller set, ensuring that your hash table memory usage will be the smaller of the two.

As the arrays are of primitive types, and short enough to be in cache lines, a fast implementation would focus on the tactical mechanics of comparisons rather than the big O complexity e.g. avoid hash-tables as these would generally involve hashing and indirection and would always involve a lot of management overhead.

If you have two sorted arrays, then intersection is O(n+m). You say that sort-then-intersect is 'brute-force' but you can't do it quicker.

If the arrays are *stored* sorted, of course, you gain further as you say you are calling the intersection often.

The intersection itself can be done with SSE.

Well, since your arrays are quite small, using insertion sort will be the fastest way to sort these two arrays, C++ STL uses insertion sort for arrays smaller than 16 items as well. Then you can use iterators over these two arrays to compare and intersect the arrays.

There may be other algorithms which would perform faster, however the overhead of these algorithms will probably be too large for 3-4 items per array.

Here's a potential optimization: check whether both arrays have max element <=32 (or 64, or maybe even 16). If they do, then fill two bitmaps of that size (of type `uint32_t`

, etc.) and intersect using a binary AND, `&`

. If they're not, resort to sorting.

Or, instead of sorting, use the highly efficient integer set representation due to Briggs and Torczon that allows linear time intersect with O(*m* + *n*) construction and O(min(*m*, *n*)) intersect. That should be much faster than a hash table with better bounds than sorting.

Similar Questions

I have to write a program in order to find the same numbers between two arrays. The problem is that I have to do it in the most optimized way respecting some constraints: -Having i,j indexes for the a

I have two tab separated files (please see the examples below): File 1 Java RAJ PERL ALEX PYTHON MAurice (and so on) File 2 ALEX 3.4 SAM 8.9 PEPPER 9.0 Now, if for instance say ALEX is also found in

I'm trying to follow the steps found here on comparing two arrays, and knowing when to create a new object, but I just don't understand how it works: You end up with two sorted arrays—one with the em

Suppose we have two finite line segments defined each by two points (in two space). I would like to find a way to get the intersection point of those two lines. Eventually, I would like to extend this

I'm looking for an algorithm to find the common intersection points between 3 spheres. Baring a complete algorithm, a thorough/detailed description of the math would be greatly helpful. This is the on

Let's say I have two binary images of the same size. How do I find the intersection between the two binary images? Only pixels of the same coordinate (location) on the two images that are white (gray

I'm trying to find a way to calculate the intersection between two arcs. I need to use this to determine how much of an Arc is visually on the right half of a circle, and how much on the left. I thoug

I try to find a solution to this problem: I have two arrays A and B of integers (A and B can have different dimensions). I have to find the common elements in these two arrays. I have another conditio

I am playing around with graphics, specifically the intersection and union of primitive shapes. Given the following: Elipse2D e1 = new Elipse2D.Double(120, 80, 80, 80); Elipse2D e2 = new Elipse2D.Doub

I have two continuous ranges and wanna check whether they have any intersection or not in MATLAB. I know it can be achieved by a few if clauses but I wanna know if there is any function in MATLAB to d

Given an unsorted integer array, and without making any assumptions on the numbers in the array: Is it possible to find two numbers whose difference is minimum in O(n) time? Edit: Difference between

When using the usual formulas to calculate intersection between two 2D segments, ie here, if you round the result to an integer, you get non-symmetric results. That is, sometimes, due to rounding erro

I am trying out a simple fill, two intersecting circles with same radius and to fill the intersection alone.Below is what I tried - (void)drawRect:(CGRect)rect { CGContextRef context = UIGraphicsGetCu

Having two arrays: string[] array1 = { Red, blue, green,black }; string[] array2 = { BlUe, yellow, black }; I need only matching strings in array (ignoring case). Result should be: stri

I have the following array structure in Javascript: var unsorted = [ {h:50,t:70}, {h:70,t:60}, {h:30,t:10}, // Ideal Value (Best combination of t being the least and h being the highest ) {h:10,t:30},

I am looking for help to write an efficient PHP algorithm to help me find occurances of a String within another string. Here is currently the situation. I have two arrays. The first array is the array

So I just stared programming in C a few days ago and I have this program which takes an unsorted file full of integers, sorts it using quicksort 1st algorithm Any suggestions on what I have done wrong

I have two rectangles caracterized by 4 values each : Left position X, top position Y, width W and height H: X1, Y1, H1, W1 X2, Y2, H2, W2 Rectangles are not rotated, like so: +--------------------&

I am looking for a function that takes the intersection between two lists and creates a new list, I have this function: let intersect x y = Set.intersect (Set.ofList x) (Set.ofList y) that do what I a

I have a quad type which is defined as: typedef struct __point { float x; float y; } point_t; typedef struct __quad { point_t p1; point_t p2; point_t p3; point_t p4; } quad_t; If I have two of those

What's the simplest, library-free code for implementing array intersections in javascript? I want to write intersection([1,2,3], [2,3,4,5]) and get [2, 3]

I have two tables: all_users and vip_users all_users table has a list of all users (you don't say?) in my system and it currently has around 57k records, while vip_users table has around 37k records.

I have two tables in Sql Server, one containing IDs for files and the slides contained in those original files, and another for sections that can contain slides from one or more of the files, potent

I'm trying to find an algorithm that will compute the intersection between 2 rectangles, which are not necessarily axis-aligned, and return the resulting intersection. This question describes finding

How can one achieve the intersection in hours/min between two timestamps? t1 = 1970-01-01 23:00:00 t2 = 1970-01-02 11:00:00 d1 = 1970-01-01 22:30:00 d2 = 1970-01-02 01:00:00 intersection = 2h:00mins

So I have the following swept AABB collision detection algorithm working all fine and dandy (based on this article) My question is how can I determine the collision normal, or in other words, the side

How do we combine two dfa using intersection method ?

Hey I want to evaluate the performance of index intersection but I'm not able to get an intersection between two indices. I've inserted some dummy records into my DB along this manual. http://docs.mon

I have been trying to find the intersection between two std::set in C++, but I keep getting an error. I created a small sample test for this #include <iostream> #include <vector> #include

I have v1 and v2 , how should I got a new v like below? v1 = {1,2} v2 = {3,4,5} v = {f(1,3) , f(1,4) , f(1,5) f(2,3) ,f(2,4) ,f(2,5)} I know I could do it using two loops, But If there is more idiom

I need to count the number of elements corresponding to the intersection of two big arrays of strings and do it very fast. I am using the following code: arr1[i].Intersect(arr2[j]).Count() For CPU Ti

Are there any standard library calls I can use to either perform set operations on two arrays, or implement such logic myself (ideally as functionally and also efficiently as possible)?

I am trying to populate an array with an intersection of an array and hash based on hash values: array2 = [hi,world,One] VALS = {:one => One, :two => Two} array2 = VALS.values & ar

I need to find out all the combinations of sum of any two numbers in an array. if it is equal then print them. The linear solution to this problem has O(N^2) complexity. I thought of sorting and then

I wrote this code to print the intersection of two arrays ( posting lists) using dev c++ the problem that when I run the program nothing is printed can you help ? I need to know where is the problem a

I'm searching for an algorithm (or a python library) which can give the size of the intersection of multiple 3D volumes. However, the volumes are not known by their equations, I just have a list of po

So, the source data is a sorted dictionary word list, and a list of random unsorted strings. The task is to extract from the list with random strings only the ones that are a combination of two words

I keep getting a null pointer exception. However, the reason eludes me. Can someone help? EDIT: I am using an Integer[][] array to represent null values as infinity symbols as floyd's algorithm is r

Given two regex's, can we write a regex which represents their intersection in each of the following two difference senses, for matching anywhere in a string? Given two regex's expr1 and expr2, can

Here's my problem. There's one unsorted array with 2N elements. All these elements are positive integer. Q: How to split this array into two N array and the sum of two array must be most close to eac

I am trying to match a small array with size of ~20 in an larger array with size of ~200000. Both arrays contains double values. Match in this case means the smallest error, because there won't be an

According to D. M. Mount, the optimal algorithm for line segment intersection reporting problem (monochromatic case) is O(nlogn + k) but for red-blue intersection reporting problem is O(n^4/3 log^O(1)

Does an algorithm exist that finds the maximum of an unsorted array in O(log n) time?

I believe there's a way to find the kth largest element in an unsorted array of length n in O(n). Or perhaps it's expected O(n) or something. How can we do this?

This question already has an answer here: Finding contiguous ranges in arrays 5 answers You are given an Array of numbers and they are unsorted/random order. You are supposed to find the longes

I have this exercise: Write a function, `rec_intersection(rect1, rect2)` and returns the intersection of the two. Rectangles are represented as a pair of coordinate-pairs: the bottom-left and top-righ

Following is an interview question asked by 'Amazon' to me. I still haven't come up with a optimized solution. Problem Statement: Given an unsorted array of integers n. Return 'true' if the addition

I am working on a data import process and need to search different 2 dimensional arrays for a specific code. The array structure look like this: Index ID Code ---------------- 0 34 P1 1 5 AR 2 71 Y2 3

Maybe I'm going insane, but I could have sworn that there was an PHP core function which took two arrays as arguments: $a = array('1', '3'); $b = array('1'=>'apples', '2'=>'oranges', '3'=>'ki

Given a unsorted sequence of a[1,...,n] of integers, give an O(nlogn) runtime algorithm to check there are two indices i and j such that a[i] =2*a[j]. The algorithm should return i=0 and j=2 on input