• 0

[JS] Searching Thought Local JSON File


Question

Imagine I had a local JSON file that looked like this:

var institutes = {
	"colleges":[
		{
			"name": "Massachusetts Institute of Technology",
			"address":{
				"line1": "77 Massachusetts Ave",
				"City": "Cambridge",
				"state": "Massachusetts",
				"zip": "02139"
			},
			"phone":"(617) 253-2139",
			"courses":["Computer Science", "Mathematics", "Physics"]
		},
		{
			"name": "Boston College",
			"address":{
				"line1": "140 Commonwealth Drive",
				"City": "Chestnut Hill",
				"state": "Massachusetts",
				"zip": "(617) 552-8000"
			},
			"phone":"(617) 253-8000",
			"courses":["Law", "Nursing", "Theology"]
		},
		{
			"name": "UC Berkeley College",
			"address":{
				"line1": "320 Mclaughlin Hall",
				"City": "Berkeley",
				"state": "California",
				"zip": "92139"
			},
			"phone":"(510) 642-5771",
			"courses":["Computer Science", "Business Administration", "Theology"]
		}
	]
};

 

 

How would I write a search function that could potentially traverse through the entirety, looking at various keys to grab a query perimeter and return the most relevant objects using JavaScript only? For example, if my function were called search() and I perform the following call:

search("mass");

It would return me something like the following:

{
    "search results":{
		"entities_found": 2,
		"highlights": 4
    },
	"colleges":[
		{
			"name": "<span class='highlight'>Mass</span>achusetts Institute of Technology",
			"address":{
				"line1": "77 <span class='highlight'>Massa</span>chusetts Ave",
				"City": "Cambridge",
				"state": "<span class='highlight'>Mass</span>achusetts",
				"zip": "02139"
			},
			"phone":"(617) 253-2139",
			"courses":["Computer Science", "Mathematics", "Physics"]
		},
		{
			"name": "Boston College",
			"address":{
				"line1": "140 Commonwealth Drive",
				"City": "Chestnut Hill",
				"state": "<span class='highlight'>Mass</span>achusetts",
				"zip": "(617) 552-8000"
			},
			"phone":"(617) 253-8000",
			"courses":["Law", "Nursing", "Theology"]
		}
};

 

 

Equally, if I were to search for the following:

search("Business Administration")

I would get this:

{
    "search results":{
		"entities_found": 1,
		"highlights": 1
    },
	"colleges":[
		{
			"name": "UC Berkeley College",
			"address":{
				"line1": "320 Mclaughlin Hall",
				"City": "Berkeley",
				"state": "California",
				"zip": "92139"
			},
			"phone":"(510) 642-5771",
			"courses":["Computer Science", "<span class='highlight'>Business Administration</span>", "Theology"]
		}
};

 

6 answers to this question

Recommended Posts

  • 0

So.. you're asking people to do your homework/job for you? This is super easy, just figure it out... or maybe don't study/work in programming if you're gonna ask somebody else to do it... I would have helped you if you at least shown you did part of the solution...

Edited by PmRd
  • 0

It's for a personal project. I've been working on it all weekend and I'm not overly impressed with what I've done so I'm kind of embarrassed to show my work. Example with what I've written it doesn't show variation so you type Massachusetts incorrectly for example in your search query it doesn't come up. So it's somewhat works. The area that I'm having particular problem with is restricting what key values to search for. For example if I wanted the user just to search for city name, zip code and courses, I don't know how to restrict it.  Is what I have, don't laugh!

 

var institutes = {
	"colleges":[
		{
			"name": "Massachusetts Institute of Technology",
			"address":{
				"line1": "77 Massachusetts Ave",
				"City": "Cambridge",
				"state": "Massachusetts",
				"zip": "02139"
			},
			"phone":"(617) 253-2139",
			"courses":["Computer Science", "Mathematics", "Physics"]
		},
		{
			"name": "Boston College",
			"address":{
				"line1": "140 Commonwealth Drive",
				"City": "Chestnut Hill",
				"state": "Massachusetts",
				"zip": "(617) 552-8000"
			},
			"phone":"(617) 253-8000",
			"courses":["Law", "Nursing", "Theology"]
		},
		{
			"name": "UC Berkeley College",
			"address":{
				"line1": "320 Mclaughlin Hall",
				"City": "Berkeley",
				"state": "California",
				"zip": "92139"
			},
			"phone":"(510) 642-5771",
			"courses":["Computer Science", "Business Administration", "Theology"]
		}
	]
};


function search(query) {
    let results = {
        "search results": {
            "entities_found": 0,
            "highlights": 0
        },
        "colleges": []
    };

    institutes.colleges.forEach(college => {
        let collegeCopy = JSON.parse(JSON.stringify(college));
        let highlights = recursiveSearch(collegeCopy, query);
        
        if (highlights > 0) {
            results['search results'].entities_found++;
            results['search results'].highlights += highlights;
            results.colleges.push(collegeCopy);
        }
    });

    return results;
}


function recursiveSearch(obj, query) {
    let totalHighlights = 0;

    for (let key in obj) {
        if (typeof obj[key] === 'string' && obj[key].toLowerCase().includes(query.toLowerCase())) {
            obj[key] = highlight(obj[key], query);
            totalHighlights++;
        } else if (typeof obj[key] === 'object') {
            totalHighlights += recursiveSearch(obj[key], query);
        }
    }

    return totalHighlights;
}


function highlight(text, query) {
    const regex = new RegExp(`(${query})`, 'ig');
    return text.replace(regex, "<span class='highlight'>$1</span>");
}


console.log(search("mass"));

 

  • 0

I've updated the code and I'm now utilizing the Levenstein distance algorithm attempt to perform some kind of fuzzy logic on typos. It's still not perfect, as you can see in the last example. I truly appreciate your help.

 

var institutes = {
    "colleges": [
        {
            "name": "Massachusetts Institute of Technology",
            "address": {
                "line1": "77 Massachusetts Ave",
                "city": "Cambridge",
                "state": "Massachusetts",
                "zip": "02139"
            },
            "phone": "(617) 253-2139",
            "courses": ["Computer Science", "Mathematics", "Physics"]
        },
        {
            "name": "Boston College",
            "address": {
                "line1": "140 Commonwealth Drive",
                "city": "Chestnut Hill",
                "state": "Massachusetts",
                "zip": "(617) 552-8000"
            },
            "phone": "(617) 253-8000",
            "courses": ["Law", "Nursing", "Theology"]
        },
        {
            "name": "UC Berkeley College",
            "address": {
                "line1": "320 Mclaughlin Hall",
                "city": "Berkeley",
                "state": "California",
                "zip": "92139"
            },
            "phone": "(510) 642-5771",
            "courses": ["Computer Science", "Business Administration", "Theology"]
        }
    ]
};


function getLevenshteinDistance(a, b) {
    if (a === b) return 0;
    if (a.length === 0) return b.length;
    if (b.length === 0) return a.length;

    const matrix = [];

    for (let i = 0; i <= b.length; i++) {
        matrix[i] = [i];
    }

    for (let j = 0; j <= a.length; j++) {
        matrix[0][j] = j;
    }

    for (let i = 1; i <= b.length; i++) {
        for (let j = 1; j <= a.length; j++) {
            if (b.charAt(i - 1) === a.charAt(j - 1)) {
                matrix[i][j] = matrix[i - 1][j - 1];
            } else {
                matrix[i][j] = Math.min(matrix[i - 1][j - 1] + 1, Math.min(matrix[i][j - 1] + 1, matrix[i - 1][j] + 1));
            }
        }
    }

    return matrix[b.length][a.length];
}

function isSimilar(str1, str2) {
    const threshold = 2;
    const distance = getLevenshteinDistance(str1.toLowerCase(), str2.toLowerCase());
    return distance <= threshold || str1.toLowerCase().includes(str2.toLowerCase());
}


function processCollege(college, query, searchKeys) {
    let matched = false;

    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && college[key].toLowerCase().includes(query.toLowerCase())) {
            matched = true;
            college[key] = college[key].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
        } else if (key === "courses" && Array.isArray(college[key])) {
            const matchedCourses = college[key].filter(course => course.toLowerCase().includes(query.toLowerCase()));
            if (matchedCourses.length > 0) {
                matched = true;
                college[key] = matchedCourses.map(course => course.replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`));
            }
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && college[key][subKey].toLowerCase().includes(query.toLowerCase())) {
                    matched = true;
                    college[key][subKey] = college[key][subKey].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
                }
            }
        }
    }

    return matched ? college : null;
}


function fuzzySearch(str, query) {
    const threshold = Math.ceil(query.length / 2);
    const normalizedStr = str.toLowerCase();
    const normalizedQuery = query.toLowerCase();

    return normalizedStr.includes(normalizedQuery) || levenshtein(normalizedStr, normalizedQuery) <= threshold;
}


function highlightMatches(str, query) {
    const regex = new RegExp(query, 'gi');
    return str.replace(regex, match => `<span class='highlight'>${match}</span>`);
}


function search(query, searchKeys) {
    let results = [];

    // Attempt to find exact matches...
    for (let college of institutes.colleges) {
        const processedCollege = processCollege({ ...college }, query, searchKeys);
        if (processedCollege) {
            results.push(processedCollege);
        }
    }

    // If no exact matches were found, try fuzzy search...
    if (results.length === 0) {
        for (let college of institutes.colleges) {
            const processedCollege = processCollegeFuzzy({ ...college }, query, searchKeys);
            if (processedCollege) {
                results.push(processedCollege);
            }
        }
    }

    return results;
}


function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}


function isFuzzyMatch(str1, str2, threshold) {
    return getLevenshteinDistance(str1.toLowerCase(), str2.toLowerCase()) <= threshold;
}



function processCollegeWithLevenshtein(college, query, searchKeys) {
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isSimilarLevenshtein(college[key], query)) {
            college[key] = college[key].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isSimilarLevenshtein(college[key][subKey], query)) {
                    college[key][subKey] = college[key][subKey].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
                    return college;
                }
            }
        }
    }
    return null;
}


// SUCCESS: Finds "Berkeley"
console.log(search("berkly", ["name", "city", "state", "zip", "courses"]));

// SUCCESS: Finds all Massachusetts colleges...
console.log(search("mass", ["name", "city", "state", "zip", "courses"]));

// FAIL: Deliberate misspelling of the text "Theology" and did not give any results..
console.log(search("thiology", ["name", "city", "state", "zip", "courses"]));

 

  • 0

Not going to comment on the overall structure of the code but the reason the thiology search isn't working in your example is because you're not searching arrays. 

In your:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

It does loop through the array (because an array is an object in js), but it doesn't check the values because searchKeys doesn't contain the array indices, i.e. 0, 1, 2, 3, ...

As a quick workaround without changing too much you could just check if the key is an array too:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if ((searchKeys.includes(subKey) || Array.isArray(college[key])) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
            console.log(subKey);
                
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

 

Edited by ZakO
  • Like 1
  • 0
On 14/11/2023 at 03:38, ZakO said:

Not going to comment on the overall structure of the code but the reason the thiology search isn't working in your example is because you're not searching arrays. 

In your:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

It does loop through the array (because an array is an object in js), but it doesn't check the values because searchKeys doesn't contain the array indices, i.e. 0, 1, 2, 3, ...

As a quick workaround without changing too much you could just check if the key is an array too:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if ((searchKeys.includes(subKey) || Array.isArray(college[key])) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
            console.log(subKey);
                
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

 

Thank you. I am new to JavaScritp, I've nor been doing it too long and I'm still learning. Thank you for guiding me

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Recently Browsing   0 members

    • No registered users viewing this page.
  • Posts

    • My father still uses a programme written in dbase3. Still manages to work with a little help from dosbox. 
    • Microsoft hides these secret Windows 11 performance boost settings available on every PC by Sayan Sen Windows enthusiasts often look for ways to extract as much performance out of their systems as possible, and it's often the case that they try and do so while trying to minimize the heat and power consumption. This is especially relevant in the case of mobile Windows PCs since laptops and notebooks tend to get hot and management of that heat and power is harder in such a form factor. As such users often turn to techniques like under-volting which can be used to squeeze out the maximum capabilities of a chip while also maintaining lowered power levels. There are official apps from AMD and Intel with the likes of Ryzen Master and XTU (Extreme Tuning Utility). While these are quite handy, most enthusiasts probably prefer to dig into the BIOS and play around with settings there like Curve Optimizer on Ryzen, which lets users set various frequency-voltage scaling values. These are essentially called P-States. If you are not familiar with them, Processor Power Management is done through Advanced Configuration and Power Interface (ACPI) P-states and C-states. While P-states or performance pwoer states handle CPU voltage-frequency scaling, C-states deal with CPU sleep states so that some of the CPU functions, which are not necessary at that moment, can be disabled. The P-states and C-states work together to make the processor run more efficiently. It helps the OS and apps determine which cores can be parked and which should be boosted. Of course not every user is an enthusiast or knows the technicalities and integrities of how things like overclocking or undervolting work. Thankfully for them Windows itself offers something pretty cool, though it is hidden by default on all systems. By default, Windows only has two P-States, "Minimum Processor State" and "Maximum Processor State." However, this can be changed with a Registry trick to expand the options under a secret "Processor performance boost mode" dropdown. This essentially enables the HWP or hardware P-States available on a device, and these are not controlled just by the OS itself as the underlying hardware gets involved too. In total there are five Processor Performance Boost Mode profiles that control how Windows requests and allows CPU turbo/boost behavior under the different power policies. They are: Disabled: In this mode, processor boosting is effectively turned off. The CPU will avoid entering turbo or boost frequencies and instead operate closer to its base frequency ceiling. This can significantly reduce power consumption and heat output, but at the cost of reduced burst performance and responsiveness in short workloads. Enabled: This is the standard behavior where boost functionality is allowed under normal conditions. The processor can opportunistically increase frequency when workload demands it, balancing performance gains with power and thermal constraints as managed by the system. Aggressive: Aggressive mode favors performance more heavily, allowing the CPU to enter higher boost states more readily and sustain them longer. This should in theory improve responsiveness under bursty or heavy workloads but increases power draw and thermal output compared to the default enabled behavior. Efficient Enabled: This mode still allows boosting, but with a stronger bias toward energy efficiency. The system attempts to use boost more selectively, avoiding unnecessary frequency spikes when the performance gain is marginal. Efficient Aggressive: This is a hybrid approach where boost is still performance-responsive, but the system continuously weighs efficiency more heavily than in Aggressive mode. It aims to deliver noticeable performance improvements while reducing wasted power in less demanding scenarios. Here's how to enable the Processor performance boost mode: Open Registry Editor: Press Win+R, type regedit, and click OK. Go to: HKLM\SYSTEM\CurrentControlSet\Control\Power\PowerSettings\54533251-82be-4824-96c1-47b60b740d00\be337238-0d82-4146-a960-4f3749d470c7 (where HKLM stands for HKEY_LOCAL_MACHINE_) Modify the value of Attributes from 1 to 2 (you can find modify option by right-clicking) After that, exit Registry, you should now be able to see the new "Processor performance boost mode" dropdown menu: As you can see there are now five new P-States or CPPC states or power profile available that help define the boost mode processor setting on your PC. Wrapping it up here's a quick run-down of the settings as defined by Microsoft itself. Setting Description Disabled The corresponding P-state-based behaviour is disabled. Collaborative Processor Performance Control (CPPC) behaviour is disabled. Enabled The corresponding P-state-based behaviour is enabled. CPPC behaviour is Efficient Enabled. Aggressive The corresponding P-state-based behaviour is enabled. CPPC behaviour is Aggressive. Efficient Enabled The corresponding P-state-based behaviour is Efficient. CPPC behaviour is Efficient Enabled. Efficient Aggressive The corresponding P-state-based behaviour is Efficient. CPPC behaviour is Aggressive. Aggressive At Guaranteed Windows calculates the desired extra performance above the guaranteed performance level, and asks the processor to deliver that specific performance level. Efficient Aggressive At Guaranteed Windows always asks the processor to deliver the highest possible performance above the guaranteed performance level. In the next part we shall be comparing these settings to explore how much of a benefit or regression they can provide in terms of performance and power efficiency. If you decide to change the values on your system and are experiencing problems like crashes or an overheating PC, make sure to revert the steps back to the original state.
    • I think he means you haven't reviewed previous UFC games. Of course it doesn't matter... Every time you just report on something that involves the President even if just simply what happened you guys usually get accused of being anti-Trump. We live in fun times.
    • So how did you solve the problem? Disabling Secure Boot isn’t a solution.
  • Recent Achievements

    • One Month Later
      Leroy Jethro Gibbs earned a badge
      One Month Later
    • Conversation Starter
      flexorcist earned a badge
      Conversation Starter
    • One Month Later
      AndreaB earned a badge
      One Month Later
    • One Month Later
      agatameier earned a badge
      One Month Later
    • Week One Done
      agatameier earned a badge
      Week One Done
  • Popular Contributors

    1. 1
      +primortal
      518
    2. 2
      +Edouard
      198
    3. 3
      PsYcHoKiLLa
      147
    4. 4
      ATLien_0
      93
    5. 5
      Steven P.
      77
  • Tell a friend

    Love Neowin? Tell a friend!