• 0

[JS] Searching Thought Local JSON File


Question

Imagine I had a local JSON file that looked like this:

var institutes = {
	"colleges":[
		{
			"name": "Massachusetts Institute of Technology",
			"address":{
				"line1": "77 Massachusetts Ave",
				"City": "Cambridge",
				"state": "Massachusetts",
				"zip": "02139"
			},
			"phone":"(617) 253-2139",
			"courses":["Computer Science", "Mathematics", "Physics"]
		},
		{
			"name": "Boston College",
			"address":{
				"line1": "140 Commonwealth Drive",
				"City": "Chestnut Hill",
				"state": "Massachusetts",
				"zip": "(617) 552-8000"
			},
			"phone":"(617) 253-8000",
			"courses":["Law", "Nursing", "Theology"]
		},
		{
			"name": "UC Berkeley College",
			"address":{
				"line1": "320 Mclaughlin Hall",
				"City": "Berkeley",
				"state": "California",
				"zip": "92139"
			},
			"phone":"(510) 642-5771",
			"courses":["Computer Science", "Business Administration", "Theology"]
		}
	]
};

 

 

How would I write a search function that could potentially traverse through the entirety, looking at various keys to grab a query perimeter and return the most relevant objects using JavaScript only? For example, if my function were called search() and I perform the following call:

search("mass");

It would return me something like the following:

{
    "search results":{
		"entities_found": 2,
		"highlights": 4
    },
	"colleges":[
		{
			"name": "<span class='highlight'>Mass</span>achusetts Institute of Technology",
			"address":{
				"line1": "77 <span class='highlight'>Massa</span>chusetts Ave",
				"City": "Cambridge",
				"state": "<span class='highlight'>Mass</span>achusetts",
				"zip": "02139"
			},
			"phone":"(617) 253-2139",
			"courses":["Computer Science", "Mathematics", "Physics"]
		},
		{
			"name": "Boston College",
			"address":{
				"line1": "140 Commonwealth Drive",
				"City": "Chestnut Hill",
				"state": "<span class='highlight'>Mass</span>achusetts",
				"zip": "(617) 552-8000"
			},
			"phone":"(617) 253-8000",
			"courses":["Law", "Nursing", "Theology"]
		}
};

 

 

Equally, if I were to search for the following:

search("Business Administration")

I would get this:

{
    "search results":{
		"entities_found": 1,
		"highlights": 1
    },
	"colleges":[
		{
			"name": "UC Berkeley College",
			"address":{
				"line1": "320 Mclaughlin Hall",
				"City": "Berkeley",
				"state": "California",
				"zip": "92139"
			},
			"phone":"(510) 642-5771",
			"courses":["Computer Science", "<span class='highlight'>Business Administration</span>", "Theology"]
		}
};

 

6 answers to this question

Recommended Posts

  • 0

So.. you're asking people to do your homework/job for you? This is super easy, just figure it out... or maybe don't study/work in programming if you're gonna ask somebody else to do it... I would have helped you if you at least shown you did part of the solution...

Edited by PmRd
  • 0

It's for a personal project. I've been working on it all weekend and I'm not overly impressed with what I've done so I'm kind of embarrassed to show my work. Example with what I've written it doesn't show variation so you type Massachusetts incorrectly for example in your search query it doesn't come up. So it's somewhat works. The area that I'm having particular problem with is restricting what key values to search for. For example if I wanted the user just to search for city name, zip code and courses, I don't know how to restrict it.  Is what I have, don't laugh!

 

var institutes = {
	"colleges":[
		{
			"name": "Massachusetts Institute of Technology",
			"address":{
				"line1": "77 Massachusetts Ave",
				"City": "Cambridge",
				"state": "Massachusetts",
				"zip": "02139"
			},
			"phone":"(617) 253-2139",
			"courses":["Computer Science", "Mathematics", "Physics"]
		},
		{
			"name": "Boston College",
			"address":{
				"line1": "140 Commonwealth Drive",
				"City": "Chestnut Hill",
				"state": "Massachusetts",
				"zip": "(617) 552-8000"
			},
			"phone":"(617) 253-8000",
			"courses":["Law", "Nursing", "Theology"]
		},
		{
			"name": "UC Berkeley College",
			"address":{
				"line1": "320 Mclaughlin Hall",
				"City": "Berkeley",
				"state": "California",
				"zip": "92139"
			},
			"phone":"(510) 642-5771",
			"courses":["Computer Science", "Business Administration", "Theology"]
		}
	]
};


function search(query) {
    let results = {
        "search results": {
            "entities_found": 0,
            "highlights": 0
        },
        "colleges": []
    };

    institutes.colleges.forEach(college => {
        let collegeCopy = JSON.parse(JSON.stringify(college));
        let highlights = recursiveSearch(collegeCopy, query);
        
        if (highlights > 0) {
            results['search results'].entities_found++;
            results['search results'].highlights += highlights;
            results.colleges.push(collegeCopy);
        }
    });

    return results;
}


function recursiveSearch(obj, query) {
    let totalHighlights = 0;

    for (let key in obj) {
        if (typeof obj[key] === 'string' && obj[key].toLowerCase().includes(query.toLowerCase())) {
            obj[key] = highlight(obj[key], query);
            totalHighlights++;
        } else if (typeof obj[key] === 'object') {
            totalHighlights += recursiveSearch(obj[key], query);
        }
    }

    return totalHighlights;
}


function highlight(text, query) {
    const regex = new RegExp(`(${query})`, 'ig');
    return text.replace(regex, "<span class='highlight'>$1</span>");
}


console.log(search("mass"));

 

  • 0

I've updated the code and I'm now utilizing the Levenstein distance algorithm attempt to perform some kind of fuzzy logic on typos. It's still not perfect, as you can see in the last example. I truly appreciate your help.

 

var institutes = {
    "colleges": [
        {
            "name": "Massachusetts Institute of Technology",
            "address": {
                "line1": "77 Massachusetts Ave",
                "city": "Cambridge",
                "state": "Massachusetts",
                "zip": "02139"
            },
            "phone": "(617) 253-2139",
            "courses": ["Computer Science", "Mathematics", "Physics"]
        },
        {
            "name": "Boston College",
            "address": {
                "line1": "140 Commonwealth Drive",
                "city": "Chestnut Hill",
                "state": "Massachusetts",
                "zip": "(617) 552-8000"
            },
            "phone": "(617) 253-8000",
            "courses": ["Law", "Nursing", "Theology"]
        },
        {
            "name": "UC Berkeley College",
            "address": {
                "line1": "320 Mclaughlin Hall",
                "city": "Berkeley",
                "state": "California",
                "zip": "92139"
            },
            "phone": "(510) 642-5771",
            "courses": ["Computer Science", "Business Administration", "Theology"]
        }
    ]
};


function getLevenshteinDistance(a, b) {
    if (a === b) return 0;
    if (a.length === 0) return b.length;
    if (b.length === 0) return a.length;

    const matrix = [];

    for (let i = 0; i <= b.length; i++) {
        matrix[i] = [i];
    }

    for (let j = 0; j <= a.length; j++) {
        matrix[0][j] = j;
    }

    for (let i = 1; i <= b.length; i++) {
        for (let j = 1; j <= a.length; j++) {
            if (b.charAt(i - 1) === a.charAt(j - 1)) {
                matrix[i][j] = matrix[i - 1][j - 1];
            } else {
                matrix[i][j] = Math.min(matrix[i - 1][j - 1] + 1, Math.min(matrix[i][j - 1] + 1, matrix[i - 1][j] + 1));
            }
        }
    }

    return matrix[b.length][a.length];
}

function isSimilar(str1, str2) {
    const threshold = 2;
    const distance = getLevenshteinDistance(str1.toLowerCase(), str2.toLowerCase());
    return distance <= threshold || str1.toLowerCase().includes(str2.toLowerCase());
}


function processCollege(college, query, searchKeys) {
    let matched = false;

    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && college[key].toLowerCase().includes(query.toLowerCase())) {
            matched = true;
            college[key] = college[key].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
        } else if (key === "courses" && Array.isArray(college[key])) {
            const matchedCourses = college[key].filter(course => course.toLowerCase().includes(query.toLowerCase()));
            if (matchedCourses.length > 0) {
                matched = true;
                college[key] = matchedCourses.map(course => course.replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`));
            }
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && college[key][subKey].toLowerCase().includes(query.toLowerCase())) {
                    matched = true;
                    college[key][subKey] = college[key][subKey].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
                }
            }
        }
    }

    return matched ? college : null;
}


function fuzzySearch(str, query) {
    const threshold = Math.ceil(query.length / 2);
    const normalizedStr = str.toLowerCase();
    const normalizedQuery = query.toLowerCase();

    return normalizedStr.includes(normalizedQuery) || levenshtein(normalizedStr, normalizedQuery) <= threshold;
}


function highlightMatches(str, query) {
    const regex = new RegExp(query, 'gi');
    return str.replace(regex, match => `<span class='highlight'>${match}</span>`);
}


function search(query, searchKeys) {
    let results = [];

    // Attempt to find exact matches...
    for (let college of institutes.colleges) {
        const processedCollege = processCollege({ ...college }, query, searchKeys);
        if (processedCollege) {
            results.push(processedCollege);
        }
    }

    // If no exact matches were found, try fuzzy search...
    if (results.length === 0) {
        for (let college of institutes.colleges) {
            const processedCollege = processCollegeFuzzy({ ...college }, query, searchKeys);
            if (processedCollege) {
                results.push(processedCollege);
            }
        }
    }

    return results;
}


function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}


function isFuzzyMatch(str1, str2, threshold) {
    return getLevenshteinDistance(str1.toLowerCase(), str2.toLowerCase()) <= threshold;
}



function processCollegeWithLevenshtein(college, query, searchKeys) {
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isSimilarLevenshtein(college[key], query)) {
            college[key] = college[key].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isSimilarLevenshtein(college[key][subKey], query)) {
                    college[key][subKey] = college[key][subKey].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
                    return college;
                }
            }
        }
    }
    return null;
}


// SUCCESS: Finds "Berkeley"
console.log(search("berkly", ["name", "city", "state", "zip", "courses"]));

// SUCCESS: Finds all Massachusetts colleges...
console.log(search("mass", ["name", "city", "state", "zip", "courses"]));

// FAIL: Deliberate misspelling of the text "Theology" and did not give any results..
console.log(search("thiology", ["name", "city", "state", "zip", "courses"]));

 

  • 0

Not going to comment on the overall structure of the code but the reason the thiology search isn't working in your example is because you're not searching arrays. 

In your:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

It does loop through the array (because an array is an object in js), but it doesn't check the values because searchKeys doesn't contain the array indices, i.e. 0, 1, 2, 3, ...

As a quick workaround without changing too much you could just check if the key is an array too:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if ((searchKeys.includes(subKey) || Array.isArray(college[key])) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
            console.log(subKey);
                
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

 

Edited by ZakO
  • Like 1
  • 0
On 14/11/2023 at 03:38, ZakO said:

Not going to comment on the overall structure of the code but the reason the thiology search isn't working in your example is because you're not searching arrays. 

In your:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

It does loop through the array (because an array is an object in js), but it doesn't check the values because searchKeys doesn't contain the array indices, i.e. 0, 1, 2, 3, ...

As a quick workaround without changing too much you could just check if the key is an array too:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if ((searchKeys.includes(subKey) || Array.isArray(college[key])) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
            console.log(subKey);
                
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

 

Thank you. I am new to JavaScritp, I've nor been doing it too long and I'm still learning. Thank you for guiding me

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Recently Browsing   0 members

    • No registered users viewing this page.
  • Posts

    • AMAZON needs to take total accountability for this.
    • Server Summit had a heap of announcements, ADCS changes are baller.
    • Nice, hope they *finally* fixed the issue with the NTFS driver where the system would completely brick during large file copies using the built in driver. It's been broken for years requiring me to use the older, slower, NTFS-3G FUSE driver.
    • Windows 11 KB5094126 BSODing, freezing, forcing BitLocker lockout, breaks OneDrive, and more by Sayan Sen Microsoft released Windows 11 KB5094126 and KB5093998 last week as the latest Patch Tuesday updates. Following that the company also published the accompanying dynamic updates under KB5094149, KB5095971, and KB5094156. While Microsoft has so far not acknowledged any major problems with the release, some users online are running into problems. These range from OneDrive and Dropbox access issues, BitLocker recovery lockouts, to blue screens and BSODs. The most common one seems to be happening with HP systems wherein affected users say they hit 0xc0430001 BSOD (blue screen of death) error code after the KB5094126 update. We wonder if this could be related to the recent bug we covered on HP devices wherein the ongoing Secure Boot certificate updates are leading to similar issues. While we are not certain, users affected by this issue likely need to ensure that the boot.stl file is included on the installation media (such as a USB installer or ISO), if the above-mentioned dynamic updates are deployed. If this file is missing, computers may fail to boot from the installation media and could display the error 0xc0430001. This STL file is used by Secure Boot to verify that the boot files are trusted, so it must match the same Windows version and system architecture. To ensure the file is included, Microsoft recommends using the Update WinPE script, which automatically updates the image and handles the required files. Alternatively, you can manually copy the boot.stl file from the Windows\Boot\EFI folder on a Windows device and place it in the matching folder on your installation media before deploying the updated image. Aside from blue screening some users also note their systems have been freezing following the update. This could be happening to Lenovo PCs specifically. In the case of the OneDrive and Dropbox access issues, a user figured out that there could be a conflict with UAC. He explained: "Okay, so I did some digging, and in our environment KB5094126 breaks OneDrive and Dropbox in Explorer. I went through all our GPOs and found out that the combination of disabling UAC and having my user being a local admin breaks OneDrive in Explorer. ... If I enable UAC again, then it works, even with KB5094126 still installed." Hopefully, Microsoft will look into these issues. Source: Microsoft forum (link1, link2, link3, link4), Reddit (link1, link2, link3, link4)
    • It is when it's a desktop in my house though for a PC that's lightly used and not really important when it is. If it was a laptop, it would be a different story. The real solution is varied and begins starting at post #22 in that thread.
  • Recent Achievements

    • Week One Done
      Jeroen Wilms earned a badge
      Week One Done
    • Week One Done
      rolfus earned a badge
      Week One Done
    • One Month Later
      Leroy Jethro Gibbs earned a badge
      One Month Later
    • Conversation Starter
      flexorcist earned a badge
      Conversation Starter
    • One Month Later
      AndreaB earned a badge
      One Month Later
  • Popular Contributors

    1. 1
      +primortal
      508
    2. 2
      +Edouard
      197
    3. 3
      PsYcHoKiLLa
      138
    4. 4
      ATLien_0
      90
    5. 5
      Steven P.
      80
  • Tell a friend

    Love Neowin? Tell a friend!