# How do I make approximate searches from a set with multiple dimensions?

Let's say I have Items, a large set of these objects:

```csharp
class Item
{
    public float Cost;
    public float Size;
    public float Weight;
    public float Temperature;
}
```

I would like to repeatedly pick an object from this set that closely matches a given input.

```csharp
Item PickItem(float cost, float size, float weight, float temperature) {...}
```

It should return an Item from Items that is close to the given inputs. It does not have to give the optimal answer; any close value, maybe even with some random variation, would be fine.

How would I go about doing this? I've been doing some research, and it looks like Principal Component Analysis (PCA) would let me group my set into similar regions, but it looks complex to implement. I'm also not sure how I would choose an item with it, other than adding the search term to the set and finding a nearby resulting value, which would require rebuilding the whole PCA system for each search.

Is there a simpler way? I could just search for the element with the smallest average difference, but it seems like I would end up with results that are in between everything, instead of results that are mostly correlated but with an outlier property or two. I guess it could work if I weighted it?

This linear scan is O(n):

```csharp
// Assumes `using System;` and an `items` collection of Item in the enclosing class.
Item PickItem(float cost, float size, float weight, float temperature)
{
    var bestDiff = float.MaxValue;
    Item bestItem = null;
    foreach (var item in items)
    {
        var diff = CalculateDifference(item, cost, size, weight, temperature);
        if (diff < bestDiff)
        {
            bestDiff = diff;
            bestItem = item;
            // An exact match cannot be beaten, so stop early.
            if (bestDiff == 0)
                return bestItem;
        }
    }
    return bestItem;
}

// Sum of the absolute per-property differences (Manhattan distance).
static float CalculateDifference(Item item, float cost, float size, float weight, float temperature) =>
    Math.Abs(item.Cost - cost) +
    Math.Abs(item.Size - size) +
    Math.Abs(item.Weight - weight) +
    Math.Abs(item.Temperature - temperature);
```
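
On the weighting idea from the question: a plain sum of differences lets whichever property has the largest numeric range dominate the result. One possible variation, sketched below, divides each difference by that property's range in the set and multiplies it by a caller-chosen weight. It also picks at random among the `k` closest items to get the "some random variation" behaviour. The `WeightedPicker` class, the `weights` array, and the `k` parameter are all hypothetical names introduced for this sketch, and it assumes a non-empty list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Item
{
    public float Cost;
    public float Size;
    public float Weight;
    public float Temperature;
}

// Hypothetical helper for illustration; not part of the original post.
class WeightedPicker
{
    private readonly List<Item> items;
    private readonly float[] weights; // importance of each property, chosen by the caller
    private readonly float[] scale;   // 1 / (max - min) of each property in the set
    private readonly Random random = new Random();

    // Assumes `items` is non-empty and `weights` has four entries
    // in the order Cost, Size, Weight, Temperature.
    public WeightedPicker(List<Item> items, float[] weights)
    {
        this.items = items;
        this.weights = weights;

        var vectors = items.Select(ToVector).ToList();
        scale = new float[4];
        for (int d = 0; d < 4; d++)
        {
            float min = float.MaxValue;
            float max = float.MinValue;
            foreach (var v in vectors)
            {
                min = Math.Min(min, v[d]);
                max = Math.Max(max, v[d]);
            }
            // A property that never varies contributes nothing to the distance.
            scale[d] = max > min ? 1f / (max - min) : 0f;
        }
    }

    private static float[] ToVector(Item i) =>
        new[] { i.Cost, i.Size, i.Weight, i.Temperature };

    // Weighted sum of range-normalized absolute differences.
    private float Distance(float[] a, float[] b)
    {
        float sum = 0f;
        for (int d = 0; d < a.Length; d++)
            sum += weights[d] * Math.Abs(a[d] - b[d]) * scale[d];
        return sum;
    }

    // Returns a random item among the k closest; k = 1 gives the single closest.
    public Item PickItem(float cost, float size, float weight, float temperature, int k = 3)
    {
        var target = new[] { cost, size, weight, temperature };
        var nearest = items
            .OrderBy(i => Distance(ToVector(i), target))
            .Take(k)
            .ToList();
        return nearest[random.Next(nearest.Count)];
    }
}
```

Sorting makes each query O(n log n) rather than O(n), which keeps the sketch short; for a very large set, tracking the best `k` candidates during a single pass, or a spatial index such as a k-d tree, would likely be a better fit.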
