• 0

[JS] Searching Thought Local JSON File


Question

Imagine I had a local JSON file that looked like this:

var institutes = {
	"colleges":[
		{
			"name": "Massachusetts Institute of Technology",
			"address":{
				"line1": "77 Massachusetts Ave",
				"City": "Cambridge",
				"state": "Massachusetts",
				"zip": "02139"
			},
			"phone":"(617) 253-2139",
			"courses":["Computer Science", "Mathematics", "Physics"]
		},
		{
			"name": "Boston College",
			"address":{
				"line1": "140 Commonwealth Drive",
				"City": "Chestnut Hill",
				"state": "Massachusetts",
				"zip": "(617) 552-8000"
			},
			"phone":"(617) 253-8000",
			"courses":["Law", "Nursing", "Theology"]
		},
		{
			"name": "UC Berkeley College",
			"address":{
				"line1": "320 Mclaughlin Hall",
				"City": "Berkeley",
				"state": "California",
				"zip": "92139"
			},
			"phone":"(510) 642-5771",
			"courses":["Computer Science", "Business Administration", "Theology"]
		}
	]
};

 

 

How would I write a search function that could potentially traverse through the entirety, looking at various keys to grab a query perimeter and return the most relevant objects using JavaScript only? For example, if my function were called search() and I perform the following call:

search("mass");

It would return me something like the following:

{
    "search results":{
		"entities_found": 2,
		"highlights": 4
    },
	"colleges":[
		{
			"name": "<span class='highlight'>Mass</span>achusetts Institute of Technology",
			"address":{
				"line1": "77 <span class='highlight'>Massa</span>chusetts Ave",
				"City": "Cambridge",
				"state": "<span class='highlight'>Mass</span>achusetts",
				"zip": "02139"
			},
			"phone":"(617) 253-2139",
			"courses":["Computer Science", "Mathematics", "Physics"]
		},
		{
			"name": "Boston College",
			"address":{
				"line1": "140 Commonwealth Drive",
				"City": "Chestnut Hill",
				"state": "<span class='highlight'>Mass</span>achusetts",
				"zip": "(617) 552-8000"
			},
			"phone":"(617) 253-8000",
			"courses":["Law", "Nursing", "Theology"]
		}
};

 

 

Equally, if I were to search for the following:

search("Business Administration")

I would get this:

{
    "search results":{
		"entities_found": 1,
		"highlights": 1
    },
	"colleges":[
		{
			"name": "UC Berkeley College",
			"address":{
				"line1": "320 Mclaughlin Hall",
				"City": "Berkeley",
				"state": "California",
				"zip": "92139"
			},
			"phone":"(510) 642-5771",
			"courses":["Computer Science", "<span class='highlight'>Business Administration</span>", "Theology"]
		}
};

 

6 answers to this question

Recommended Posts

  • 0

So.. you're asking people to do your homework/job for you? This is super easy, just figure it out... or maybe don't study/work in programming if you're gonna ask somebody else to do it... I would have helped you if you at least shown you did part of the solution...

Edited by PmRd
  • 0

It's for a personal project. I've been working on it all weekend and I'm not overly impressed with what I've done so I'm kind of embarrassed to show my work. Example with what I've written it doesn't show variation so you type Massachusetts incorrectly for example in your search query it doesn't come up. So it's somewhat works. The area that I'm having particular problem with is restricting what key values to search for. For example if I wanted the user just to search for city name, zip code and courses, I don't know how to restrict it.  Is what I have, don't laugh!

 

var institutes = {
	"colleges":[
		{
			"name": "Massachusetts Institute of Technology",
			"address":{
				"line1": "77 Massachusetts Ave",
				"City": "Cambridge",
				"state": "Massachusetts",
				"zip": "02139"
			},
			"phone":"(617) 253-2139",
			"courses":["Computer Science", "Mathematics", "Physics"]
		},
		{
			"name": "Boston College",
			"address":{
				"line1": "140 Commonwealth Drive",
				"City": "Chestnut Hill",
				"state": "Massachusetts",
				"zip": "(617) 552-8000"
			},
			"phone":"(617) 253-8000",
			"courses":["Law", "Nursing", "Theology"]
		},
		{
			"name": "UC Berkeley College",
			"address":{
				"line1": "320 Mclaughlin Hall",
				"City": "Berkeley",
				"state": "California",
				"zip": "92139"
			},
			"phone":"(510) 642-5771",
			"courses":["Computer Science", "Business Administration", "Theology"]
		}
	]
};


function search(query) {
    let results = {
        "search results": {
            "entities_found": 0,
            "highlights": 0
        },
        "colleges": []
    };

    institutes.colleges.forEach(college => {
        let collegeCopy = JSON.parse(JSON.stringify(college));
        let highlights = recursiveSearch(collegeCopy, query);
        
        if (highlights > 0) {
            results['search results'].entities_found++;
            results['search results'].highlights += highlights;
            results.colleges.push(collegeCopy);
        }
    });

    return results;
}


function recursiveSearch(obj, query) {
    let totalHighlights = 0;

    for (let key in obj) {
        if (typeof obj[key] === 'string' && obj[key].toLowerCase().includes(query.toLowerCase())) {
            obj[key] = highlight(obj[key], query);
            totalHighlights++;
        } else if (typeof obj[key] === 'object') {
            totalHighlights += recursiveSearch(obj[key], query);
        }
    }

    return totalHighlights;
}


function highlight(text, query) {
    const regex = new RegExp(`(${query})`, 'ig');
    return text.replace(regex, "<span class='highlight'>$1</span>");
}


console.log(search("mass"));

 

  • 0

I've updated the code and I'm now utilizing the Levenstein distance algorithm attempt to perform some kind of fuzzy logic on typos. It's still not perfect, as you can see in the last example. I truly appreciate your help.

 

var institutes = {
    "colleges": [
        {
            "name": "Massachusetts Institute of Technology",
            "address": {
                "line1": "77 Massachusetts Ave",
                "city": "Cambridge",
                "state": "Massachusetts",
                "zip": "02139"
            },
            "phone": "(617) 253-2139",
            "courses": ["Computer Science", "Mathematics", "Physics"]
        },
        {
            "name": "Boston College",
            "address": {
                "line1": "140 Commonwealth Drive",
                "city": "Chestnut Hill",
                "state": "Massachusetts",
                "zip": "(617) 552-8000"
            },
            "phone": "(617) 253-8000",
            "courses": ["Law", "Nursing", "Theology"]
        },
        {
            "name": "UC Berkeley College",
            "address": {
                "line1": "320 Mclaughlin Hall",
                "city": "Berkeley",
                "state": "California",
                "zip": "92139"
            },
            "phone": "(510) 642-5771",
            "courses": ["Computer Science", "Business Administration", "Theology"]
        }
    ]
};


function getLevenshteinDistance(a, b) {
    if (a === b) return 0;
    if (a.length === 0) return b.length;
    if (b.length === 0) return a.length;

    const matrix = [];

    for (let i = 0; i <= b.length; i++) {
        matrix[i] = [i];
    }

    for (let j = 0; j <= a.length; j++) {
        matrix[0][j] = j;
    }

    for (let i = 1; i <= b.length; i++) {
        for (let j = 1; j <= a.length; j++) {
            if (b.charAt(i - 1) === a.charAt(j - 1)) {
                matrix[i][j] = matrix[i - 1][j - 1];
            } else {
                matrix[i][j] = Math.min(matrix[i - 1][j - 1] + 1, Math.min(matrix[i][j - 1] + 1, matrix[i - 1][j] + 1));
            }
        }
    }

    return matrix[b.length][a.length];
}

function isSimilar(str1, str2) {
    const threshold = 2;
    const distance = getLevenshteinDistance(str1.toLowerCase(), str2.toLowerCase());
    return distance <= threshold || str1.toLowerCase().includes(str2.toLowerCase());
}


function processCollege(college, query, searchKeys) {
    let matched = false;

    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && college[key].toLowerCase().includes(query.toLowerCase())) {
            matched = true;
            college[key] = college[key].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
        } else if (key === "courses" && Array.isArray(college[key])) {
            const matchedCourses = college[key].filter(course => course.toLowerCase().includes(query.toLowerCase()));
            if (matchedCourses.length > 0) {
                matched = true;
                college[key] = matchedCourses.map(course => course.replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`));
            }
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && college[key][subKey].toLowerCase().includes(query.toLowerCase())) {
                    matched = true;
                    college[key][subKey] = college[key][subKey].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
                }
            }
        }
    }

    return matched ? college : null;
}


function fuzzySearch(str, query) {
    const threshold = Math.ceil(query.length / 2);
    const normalizedStr = str.toLowerCase();
    const normalizedQuery = query.toLowerCase();

    return normalizedStr.includes(normalizedQuery) || levenshtein(normalizedStr, normalizedQuery) <= threshold;
}


function highlightMatches(str, query) {
    const regex = new RegExp(query, 'gi');
    return str.replace(regex, match => `<span class='highlight'>${match}</span>`);
}


function search(query, searchKeys) {
    let results = [];

    // Attempt to find exact matches...
    for (let college of institutes.colleges) {
        const processedCollege = processCollege({ ...college }, query, searchKeys);
        if (processedCollege) {
            results.push(processedCollege);
        }
    }

    // If no exact matches were found, try fuzzy search...
    if (results.length === 0) {
        for (let college of institutes.colleges) {
            const processedCollege = processCollegeFuzzy({ ...college }, query, searchKeys);
            if (processedCollege) {
                results.push(processedCollege);
            }
        }
    }

    return results;
}


function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}


function isFuzzyMatch(str1, str2, threshold) {
    return getLevenshteinDistance(str1.toLowerCase(), str2.toLowerCase()) <= threshold;
}



function processCollegeWithLevenshtein(college, query, searchKeys) {
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isSimilarLevenshtein(college[key], query)) {
            college[key] = college[key].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isSimilarLevenshtein(college[key][subKey], query)) {
                    college[key][subKey] = college[key][subKey].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
                    return college;
                }
            }
        }
    }
    return null;
}


// SUCCESS: Finds "Berkeley"
console.log(search("berkly", ["name", "city", "state", "zip", "courses"]));

// SUCCESS: Finds all Massachusetts colleges...
console.log(search("mass", ["name", "city", "state", "zip", "courses"]));

// FAIL: Deliberate misspelling of the text "Theology" and did not give any results..
console.log(search("thiology", ["name", "city", "state", "zip", "courses"]));

 

  • 0

Not going to comment on the overall structure of the code but the reason the thiology search isn't working in your example is because you're not searching arrays. 

In your:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

It does loop through the array (because an array is an object in js), but it doesn't check the values because searchKeys doesn't contain the array indices, i.e. 0, 1, 2, 3, ...

As a quick workaround without changing too much you could just check if the key is an array too:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if ((searchKeys.includes(subKey) || Array.isArray(college[key])) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
            console.log(subKey);
                
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

 

Edited by ZakO
  • Like 1
  • 0
On 14/11/2023 at 03:38, ZakO said:

Not going to comment on the overall structure of the code but the reason the thiology search isn't working in your example is because you're not searching arrays. 

In your:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

It does loop through the array (because an array is an object in js), but it doesn't check the values because searchKeys doesn't contain the array indices, i.e. 0, 1, 2, 3, ...

As a quick workaround without changing too much you could just check if the key is an array too:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if ((searchKeys.includes(subKey) || Array.isArray(college[key])) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
            console.log(subKey);
                
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

 

Thank you. I am new to JavaScritp, I've nor been doing it too long and I'm still learning. Thank you for guiding me

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Recently Browsing   0 members

    • No registered users viewing this page.
  • Posts

    • Lol I had one of these turn faulty in Jan, guess it wasn't just bad luck lol
    • I'm team Rossmann all the way. I have the exact same NVME, altough not in an array like him.
    • It had gone weeks ago. Although thinking about it I'm on the beta.
    • They thought value of their goods would forever only drop like it used to and didn't account for sudden increase in price because of all the Ai hype. Tough luck Samsung, don't try to weasel this one out. Also American customer protection laws are a**. In Europe, you need to be compensated for a functioning product of same or better characteristics (not same price point as when it was originally bought!) if it can't be repaired and when you receive a replacement product your warranty starts from scratch because you received a different item than you previously had and old warranty thus cannot apply to it anymore. If your actual item was successfully repaired, warranty gets extended for the period the item was in service. If item is repaired to a significant extent, warranty also starts over from scratch because major part of it was replaced. Americans need to fight to get this kind of consumer protections because they are constantly getting screwed over.
    • Microsoft releases new Windows 11 Media Creation Tool with the latest updates by Taras Buria Patch Tuesday updates arrive every month, bringing users new features and security updates. To make sure customers have access to the most recent images, Microsoft also releases updates to the Media Creation Tool app, its official utility for Windows 11 installation. Today, the company pushed new ISOs to Media Creation Tool, allowing you to create images with the June 2026 Patch Tuesday updates. With the latest update, the Media Creation Tool now downloads KB5094126. It is Windows 11 version 25H2, build 26200.8655, which is also available via Windows Update. Note that the app itself remains on the previous version, which you can check in Properties > Details. The only change is that it now downloads a more recent Windows 11 build, so the only way to check is to download an ISO. The June 2026 Patch Tuesday update is a special release for Windows 11, as it brings a new performance profile to make the operating system more responsive and snappier when rendering various user interface surfaces, including the Start menu, quick settings, and more. It does so by spiking processor speeds for a brief moment, resulting in higher loads for a second or two. The so-called “Low latency profile” is rolling out gradually, but you can force-enable it with the ViVeTool app. Other changes include webcam improvements, Task Manager updates, shared audio support, and more. You can download the Media Creation Tool app from the official Microsoft website using this link. Besides MCT, Microsoft lets you download Windows 11 ISO as a file directly from the official Windows 11 website. However, you will need a third-party app to write it to your USB drive. Check out this guide if you want to know how to do that.
  • Recent Achievements

    • Week One Done
      davidbazooked earned a badge
      Week One Done
    • One Month Later
      Jamswaz earned a badge
      One Month Later
    • Week One Done
      Jamswaz earned a badge
      Week One Done
    • Rookie
      Marzoid went up a rank
      Rookie
    • Community Regular
      coch went up a rank
      Community Regular
  • Popular Contributors

    1. 1
      +primortal
      509
    2. 2
      PsYcHoKiLLa
      184
    3. 3
      +Edouard
      158
    4. 4
      Steven P.
      83
    5. 5
      ATLien_0
      75
  • Tell a friend

    Love Neowin? Tell a friend!