[JS] Searching Through Local JSON File


Question

Imagine I had a local JSON file that looked like this:

var institutes = {
	"colleges":[
		{
			"name": "Massachusetts Institute of Technology",
			"address":{
				"line1": "77 Massachusetts Ave",
				"City": "Cambridge",
				"state": "Massachusetts",
				"zip": "02139"
			},
			"phone":"(617) 253-2139",
			"courses":["Computer Science", "Mathematics", "Physics"]
		},
		{
			"name": "Boston College",
			"address":{
				"line1": "140 Commonwealth Drive",
				"City": "Chestnut Hill",
				"state": "Massachusetts",
				"zip": "(617) 552-8000"
			},
			"phone":"(617) 253-8000",
			"courses":["Law", "Nursing", "Theology"]
		},
		{
			"name": "UC Berkeley College",
			"address":{
				"line1": "320 Mclaughlin Hall",
				"City": "Berkeley",
				"state": "California",
				"zip": "92139"
			},
			"phone":"(510) 642-5771",
			"courses":["Computer Science", "Business Administration", "Theology"]
		}
	]
};

 

 

How would I write a search function that could traverse the entire structure, looking at various keys for a query parameter, and return the most relevant objects using JavaScript only? For example, if my function were called search() and I performed the following call:

search("mass");

It would return me something like the following:

{
    "search results":{
		"entities_found": 2,
		"highlights": 4
    },
	"colleges":[
		{
			"name": "<span class='highlight'>Mass</span>achusetts Institute of Technology",
			"address":{
				"line1": "77 <span class='highlight'>Mass</span>achusetts Ave",
				"City": "Cambridge",
				"state": "<span class='highlight'>Mass</span>achusetts",
				"zip": "02139"
			},
			"phone":"(617) 253-2139",
			"courses":["Computer Science", "Mathematics", "Physics"]
		},
		{
			"name": "Boston College",
			"address":{
				"line1": "140 Commonwealth Drive",
				"City": "Chestnut Hill",
				"state": "<span class='highlight'>Mass</span>achusetts",
				"zip": "(617) 552-8000"
			},
			"phone":"(617) 253-8000",
			"courses":["Law", "Nursing", "Theology"]
		}
	]
};

 

 

Equally, if I were to search for the following:

search("Business Administration")

I would get this:

{
    "search results":{
		"entities_found": 1,
		"highlights": 1
    },
	"colleges":[
		{
			"name": "UC Berkeley College",
			"address":{
				"line1": "320 Mclaughlin Hall",
				"City": "Berkeley",
				"state": "California",
				"zip": "92139"
			},
			"phone":"(510) 642-5771",
			"courses":["Computer Science", "<span class='highlight'>Business Administration</span>", "Theology"]
		}
	]
};

 

6 answers to this question

  • 0

So... you're asking people to do your homework/job for you? This is super easy, just figure it out... or maybe don't study/work in programming if you're gonna ask somebody else to do it... I would have helped you if you had at least shown that you did part of the solution...

Edited by PmRd
  • 0

It's for a personal project. I've been working on it all weekend, and I'm not overly impressed with what I've done, so I'm kind of embarrassed to show my work. For example, what I've written doesn't handle variation: if you type Massachusetts incorrectly in your search query, nothing comes up. So it somewhat works. The area I'm having particular trouble with is restricting which keys to search. For example, if I wanted the user to search only by city name, zip code, and courses, I don't know how to restrict it. Here is what I have, don't laugh!

 

var institutes = {
	"colleges":[
		{
			"name": "Massachusetts Institute of Technology",
			"address":{
				"line1": "77 Massachusetts Ave",
				"City": "Cambridge",
				"state": "Massachusetts",
				"zip": "02139"
			},
			"phone":"(617) 253-2139",
			"courses":["Computer Science", "Mathematics", "Physics"]
		},
		{
			"name": "Boston College",
			"address":{
				"line1": "140 Commonwealth Drive",
				"City": "Chestnut Hill",
				"state": "Massachusetts",
				"zip": "(617) 552-8000"
			},
			"phone":"(617) 253-8000",
			"courses":["Law", "Nursing", "Theology"]
		},
		{
			"name": "UC Berkeley College",
			"address":{
				"line1": "320 Mclaughlin Hall",
				"City": "Berkeley",
				"state": "California",
				"zip": "92139"
			},
			"phone":"(510) 642-5771",
			"courses":["Computer Science", "Business Administration", "Theology"]
		}
	]
};


function search(query) {
    let results = {
        "search results": {
            "entities_found": 0,
            "highlights": 0
        },
        "colleges": []
    };

    institutes.colleges.forEach(college => {
        let collegeCopy = JSON.parse(JSON.stringify(college));
        let highlights = recursiveSearch(collegeCopy, query);
        
        if (highlights > 0) {
            results['search results'].entities_found++;
            results['search results'].highlights += highlights;
            results.colleges.push(collegeCopy);
        }
    });

    return results;
}


function recursiveSearch(obj, query) {
    let totalHighlights = 0;

    for (let key in obj) {
        if (typeof obj[key] === 'string' && obj[key].toLowerCase().includes(query.toLowerCase())) {
            obj[key] = highlight(obj[key], query);
            totalHighlights++;
        } else if (typeof obj[key] === 'object') {
            totalHighlights += recursiveSearch(obj[key], query);
        }
    }

    return totalHighlights;
}


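function highlight(text, query) {
    // Escape regex metacharacters so queries such as "(617)" are matched literally.
    const escaped = query.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    const regex = new RegExp(`(${escaped})`, 'ig');
    return text.replace(regex, "<span class='highlight'>$1</span>");
}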


console.log(search("mass"));
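
One possible way to restrict which keys are searched, sketched under some assumptions: the names `escapeRegExp`, `searchIn`, and the `allowedKeys` parameter are hypothetical and not part of the code above. Array items are checked against the key of the array that holds them, so listing "courses" covers every course. The sketch also counts every occurrence rather than every matching field.

```javascript
// Hypothetical helper: escape regex metacharacters so the query is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical recursive search. `parentKey` is the nearest non-array key, so
// array items such as courses[0] are checked against the key "courses".
// Mutates `node` in place, so pass a copy.
function searchIn(node, query, allowedKeys, parentKey) {
    let count = 0;
    const regex = new RegExp(escapeRegExp(query), "gi");

    for (const key of Object.keys(node)) {
        const effectiveKey = Array.isArray(node) ? parentKey : key;
        const value = node[key];

        if (typeof value === "string") {
            if (!allowedKeys.includes(effectiveKey)) continue;
            const matches = value.match(regex);
            if (matches) {
                count += matches.length; // every occurrence counts as a highlight
                node[key] = value.replace(regex, m => `<span class='highlight'>${m}</span>`);
            }
        } else if (value !== null && typeof value === "object") {
            count += searchIn(value, query, allowedKeys, effectiveKey);
        }
    }
    return count;
}

const sample = {
    name: "Massachusetts Institute of Technology",
    address: { City: "Cambridge", state: "Massachusetts", zip: "02139" },
    courses: ["Computer Science", "Mathematics", "Physics"]
};

const sampleCopy = JSON.parse(JSON.stringify(sample));
const hits = searchIn(sampleCopy, "phys", ["City", "zip", "courses"]);
console.log(hits, sampleCopy.courses);
```

Because "name" and "state" are left out of `allowedKeys` here, a query such as "mass" would find nothing in this sample even though both fields contain it.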

 

  • 0

I've updated the code and I'm now using the Levenshtein distance algorithm to attempt some kind of fuzzy matching on typos. It's still not perfect, as you can see in the last example. I truly appreciate your help.

 

var institutes = {
    "colleges": [
        {
            "name": "Massachusetts Institute of Technology",
            "address": {
                "line1": "77 Massachusetts Ave",
                "city": "Cambridge",
                "state": "Massachusetts",
                "zip": "02139"
            },
            "phone": "(617) 253-2139",
            "courses": ["Computer Science", "Mathematics", "Physics"]
        },
        {
            "name": "Boston College",
            "address": {
                "line1": "140 Commonwealth Drive",
                "city": "Chestnut Hill",
                "state": "Massachusetts",
                "zip": "(617) 552-8000"
            },
            "phone": "(617) 253-8000",
            "courses": ["Law", "Nursing", "Theology"]
        },
        {
            "name": "UC Berkeley College",
            "address": {
                "line1": "320 Mclaughlin Hall",
                "city": "Berkeley",
                "state": "California",
                "zip": "92139"
            },
            "phone": "(510) 642-5771",
            "courses": ["Computer Science", "Business Administration", "Theology"]
        }
    ]
};


function getLevenshteinDistance(a, b) {
    if (a === b) return 0;
    if (a.length === 0) return b.length;
    if (b.length === 0) return a.length;

    const matrix = [];

    for (let i = 0; i <= b.length; i++) {
        matrix[i] = [i];
    }

    for (let j = 0; j <= a.length; j++) {
        matrix[0][j] = j;
    }

    for (let i = 1; i <= b.length; i++) {
        for (let j = 1; j <= a.length; j++) {
            if (b.charAt(i - 1) === a.charAt(j - 1)) {
                matrix[i][j] = matrix[i - 1][j - 1];
            } else {
                matrix[i][j] = Math.min(matrix[i - 1][j - 1] + 1, Math.min(matrix[i][j - 1] + 1, matrix[i - 1][j] + 1));
            }
        }
    }

    return matrix[b.length][a.length];
}

function isSimilar(str1, str2) {
    const threshold = 2;
    const distance = getLevenshteinDistance(str1.toLowerCase(), str2.toLowerCase());
    return distance <= threshold || str1.toLowerCase().includes(str2.toLowerCase());
}


function processCollege(college, query, searchKeys) {
    let matched = false;

    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && college[key].toLowerCase().includes(query.toLowerCase())) {
            matched = true;
            college[key] = college[key].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
        } else if (key === "courses" && Array.isArray(college[key])) {
            const matchedCourses = college[key].filter(course => course.toLowerCase().includes(query.toLowerCase()));
            if (matchedCourses.length > 0) {
                matched = true;
                college[key] = matchedCourses.map(course => course.replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`));
            }
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && college[key][subKey].toLowerCase().includes(query.toLowerCase())) {
                    matched = true;
                    college[key][subKey] = college[key][subKey].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
                }
            }
        }
    }

    return matched ? college : null;
}


function fuzzySearch(str, query) {
    const threshold = Math.ceil(query.length / 2);
    const normalizedStr = str.toLowerCase();
    const normalizedQuery = query.toLowerCase();

    return normalizedStr.includes(normalizedQuery) || getLevenshteinDistance(normalizedStr, normalizedQuery) <= threshold;
}


function highlightMatches(str, query) {
    const regex = new RegExp(query, 'gi');
    return str.replace(regex, match => `<span class='highlight'>${match}</span>`);
}


function search(query, searchKeys) {
    let results = [];

    // Attempt to find exact matches...
    for (let college of institutes.colleges) {
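        // Deep copy: a spread copy would share the nested address object with the original data.
        const processedCollege = processCollege(JSON.parse(JSON.stringify(college)), query, searchKeys);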
        if (processedCollege) {
            results.push(processedCollege);
        }
    }

    // If no exact matches were found, try fuzzy search...
    if (results.length === 0) {
        for (let college of institutes.colleges) {
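            // Deep copy: a spread copy would share the nested address object with the original data.
            const processedCollege = processCollegeFuzzy(JSON.parse(JSON.stringify(college)), query, searchKeys);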
            if (processedCollege) {
                results.push(processedCollege);
            }
        }
    }

    return results;
}


function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}


function isFuzzyMatch(str1, str2, threshold) {
    return getLevenshteinDistance(str1.toLowerCase(), str2.toLowerCase()) <= threshold;
}



function processCollegeWithLevenshtein(college, query, searchKeys) {
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isSimilar(college[key], query)) {
            college[key] = college[key].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isSimilar(college[key][subKey], query)) {
                    college[key][subKey] = college[key][subKey].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
                    return college;
                }
            }
        }
    }
    return null;
}


// SUCCESS: Finds "Berkeley"
console.log(search("berkly", ["name", "city", "state", "zip", "courses"]));

// SUCCESS: Finds all Massachusetts colleges...
console.log(search("mass", ["name", "city", "state", "zip", "courses"]));

// FAIL: Deliberate misspelling of the text "Theology" and did not give any results..
console.log(search("thiology", ["name", "city", "state", "zip", "courses"]));
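
A possible reason the fuzzy matching is fragile: `getLevenshteinDistance` compares the query with the whole field, so a short query can only match short, single-word values. A minimal sketch of one alternative compares the query against each word of the field instead. The names `levenshteinDistance` and `fuzzyWordMatch` and the default threshold of 2 are illustrative assumptions, not part of the code above.

```javascript
// Compact two-row Levenshtein distance (gives the same result as a full matrix).
function levenshteinDistance(a, b) {
    let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
    for (let i = 1; i <= a.length; i++) {
        const curr = [i];
        for (let j = 1; j <= b.length; j++) {
            const cost = a[i - 1] === b[j - 1] ? 0 : 1;
            curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
        }
        prev = curr;
    }
    return prev[b.length];
}

// Hypothetical helper: true if any whitespace-separated word of `text`
// is within `threshold` edits of `query` (case-insensitive).
function fuzzyWordMatch(text, query, threshold = 2) {
    const q = query.toLowerCase();
    return text
        .toLowerCase()
        .split(/\s+/)
        .some(word => levenshteinDistance(word, q) <= threshold);
}

console.log(fuzzyWordMatch("UC Berkeley College", "berkly"));  // true
console.log(fuzzyWordMatch("Theology", "thiology"));           // true
console.log(fuzzyWordMatch("Mathematics", "thiology"));        // false
```

This only handles single-word queries; a multi-word query such as "Business Administration" would need to be split and matched word by word as well. A fixed threshold of 2 is also generous for very short words, so scaling it with the query length may be worth trying.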

 

  • 0

Not going to comment on the overall structure of the code but the reason the thiology search isn't working in your example is because you're not searching arrays. 

In your:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

It does loop through the array (because an array is an object in js), but it doesn't check the values because searchKeys doesn't contain the array indices, i.e. 0, 1, 2, 3, ...
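
To illustrate the point, a small sketch (the variable names are made up for this demo): `for...in` over an array yields its indices as strings, and none of them appear in `searchKeys`.

```javascript
const courses = ["Law", "Nursing", "Theology"];
const searchKeys = ["name", "city", "state", "zip", "courses"];

// Collect the keys that a for...in loop sees when given an array.
const seenKeys = [];
for (const subKey in courses) {
    seenKeys.push(subKey);
}

console.log(seenKeys);                                    // [ '0', '1', '2' ]
console.log(seenKeys.some(k => searchKeys.includes(k)));  // false
```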

As a quick workaround without changing too much, you could just check whether the value is an array too:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if ((searchKeys.includes(subKey) || Array.isArray(college[key])) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}
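
One caveat with this workaround: `Array.isArray(college[key])` lets every array through, even when its key is not in `searchKeys`. A hedged sketch of a stricter check follows; `isSearchable` is a hypothetical helper, not part of the posted code.

```javascript
// Hypothetical helper: decide whether container[subKey] should be searched.
// Array items are searchable only if the array's own key is allow-listed;
// plain object properties are searchable only if their own key is.
function isSearchable(container, containerKey, subKey, searchKeys) {
    if (Array.isArray(container)) {
        return searchKeys.includes(containerKey);
    }
    return searchKeys.includes(subKey);
}

const college = {
    address: { city: "Berkeley", zip: "92139" },
    courses: ["Computer Science", "Theology"]
};

console.log(isSearchable(college.courses, "courses", "1", ["city", "courses"])); // true
console.log(isSearchable(college.courses, "courses", "1", ["city"]));            // false
console.log(isSearchable(college.address, "address", "zip", ["city"]));          // false
```

In the inner loop, the first part of the condition would then read `isSearchable(college[key], key, subKey, searchKeys)` in place of `(searchKeys.includes(subKey) || Array.isArray(college[key]))`.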

 

Edited by ZakO
  • 0
On 14/11/2023 at 03:38, ZakO said:

Not going to comment on the overall structure of the code but the reason the thiology search isn't working in your example is because you're not searching arrays. 


Thank you. I am new to JavaScript; I haven't been doing it for long and I'm still learning. Thank you for guiding me.
