• 0

[JS] Searching Thought Local JSON File


Question

Imagine I had a local JSON file that looked like this:

var institutes = {
	"colleges":[
		{
			"name": "Massachusetts Institute of Technology",
			"address":{
				"line1": "77 Massachusetts Ave",
				"City": "Cambridge",
				"state": "Massachusetts",
				"zip": "02139"
			},
			"phone":"(617) 253-2139",
			"courses":["Computer Science", "Mathematics", "Physics"]
		},
		{
			"name": "Boston College",
			"address":{
				"line1": "140 Commonwealth Drive",
				"City": "Chestnut Hill",
				"state": "Massachusetts",
				"zip": "(617) 552-8000"
			},
			"phone":"(617) 253-8000",
			"courses":["Law", "Nursing", "Theology"]
		},
		{
			"name": "UC Berkeley College",
			"address":{
				"line1": "320 Mclaughlin Hall",
				"City": "Berkeley",
				"state": "California",
				"zip": "92139"
			},
			"phone":"(510) 642-5771",
			"courses":["Computer Science", "Business Administration", "Theology"]
		}
	]
};

 

 

How would I write a search function that could potentially traverse through the entirety, looking at various keys to grab a query perimeter and return the most relevant objects using JavaScript only? For example, if my function were called search() and I perform the following call:

search("mass");

It would return me something like the following:

{
    "search results":{
		"entities_found": 2,
		"highlights": 4
    },
	"colleges":[
		{
			"name": "<span class='highlight'>Mass</span>achusetts Institute of Technology",
			"address":{
				"line1": "77 <span class='highlight'>Massa</span>chusetts Ave",
				"City": "Cambridge",
				"state": "<span class='highlight'>Mass</span>achusetts",
				"zip": "02139"
			},
			"phone":"(617) 253-2139",
			"courses":["Computer Science", "Mathematics", "Physics"]
		},
		{
			"name": "Boston College",
			"address":{
				"line1": "140 Commonwealth Drive",
				"City": "Chestnut Hill",
				"state": "<span class='highlight'>Mass</span>achusetts",
				"zip": "(617) 552-8000"
			},
			"phone":"(617) 253-8000",
			"courses":["Law", "Nursing", "Theology"]
		}
};

 

 

Equally, if I were to search for the following:

search("Business Administration")

I would get this:

{
    "search results":{
		"entities_found": 1,
		"highlights": 1
    },
	"colleges":[
		{
			"name": "UC Berkeley College",
			"address":{
				"line1": "320 Mclaughlin Hall",
				"City": "Berkeley",
				"state": "California",
				"zip": "92139"
			},
			"phone":"(510) 642-5771",
			"courses":["Computer Science", "<span class='highlight'>Business Administration</span>", "Theology"]
		}
};

 

6 answers to this question

Recommended Posts

  • 0

So.. you're asking people to do your homework/job for you? This is super easy, just figure it out... or maybe don't study/work in programming if you're gonna ask somebody else to do it... I would have helped you if you at least shown you did part of the solution...

Edited by PmRd
  • 0

It's for a personal project. I've been working on it all weekend and I'm not overly impressed with what I've done so I'm kind of embarrassed to show my work. Example with what I've written it doesn't show variation so you type Massachusetts incorrectly for example in your search query it doesn't come up. So it's somewhat works. The area that I'm having particular problem with is restricting what key values to search for. For example if I wanted the user just to search for city name, zip code and courses, I don't know how to restrict it.  Is what I have, don't laugh!

 

var institutes = {
	"colleges":[
		{
			"name": "Massachusetts Institute of Technology",
			"address":{
				"line1": "77 Massachusetts Ave",
				"City": "Cambridge",
				"state": "Massachusetts",
				"zip": "02139"
			},
			"phone":"(617) 253-2139",
			"courses":["Computer Science", "Mathematics", "Physics"]
		},
		{
			"name": "Boston College",
			"address":{
				"line1": "140 Commonwealth Drive",
				"City": "Chestnut Hill",
				"state": "Massachusetts",
				"zip": "(617) 552-8000"
			},
			"phone":"(617) 253-8000",
			"courses":["Law", "Nursing", "Theology"]
		},
		{
			"name": "UC Berkeley College",
			"address":{
				"line1": "320 Mclaughlin Hall",
				"City": "Berkeley",
				"state": "California",
				"zip": "92139"
			},
			"phone":"(510) 642-5771",
			"courses":["Computer Science", "Business Administration", "Theology"]
		}
	]
};


function search(query) {
    let results = {
        "search results": {
            "entities_found": 0,
            "highlights": 0
        },
        "colleges": []
    };

    institutes.colleges.forEach(college => {
        let collegeCopy = JSON.parse(JSON.stringify(college));
        let highlights = recursiveSearch(collegeCopy, query);
        
        if (highlights > 0) {
            results['search results'].entities_found++;
            results['search results'].highlights += highlights;
            results.colleges.push(collegeCopy);
        }
    });

    return results;
}


function recursiveSearch(obj, query) {
    let totalHighlights = 0;

    for (let key in obj) {
        if (typeof obj[key] === 'string' && obj[key].toLowerCase().includes(query.toLowerCase())) {
            obj[key] = highlight(obj[key], query);
            totalHighlights++;
        } else if (typeof obj[key] === 'object') {
            totalHighlights += recursiveSearch(obj[key], query);
        }
    }

    return totalHighlights;
}


function highlight(text, query) {
    const regex = new RegExp(`(${query})`, 'ig');
    return text.replace(regex, "<span class='highlight'>$1</span>");
}


console.log(search("mass"));

 

  • 0

I've updated the code and I'm now utilizing the Levenstein distance algorithm attempt to perform some kind of fuzzy logic on typos. It's still not perfect, as you can see in the last example. I truly appreciate your help.

 

var institutes = {
    "colleges": [
        {
            "name": "Massachusetts Institute of Technology",
            "address": {
                "line1": "77 Massachusetts Ave",
                "city": "Cambridge",
                "state": "Massachusetts",
                "zip": "02139"
            },
            "phone": "(617) 253-2139",
            "courses": ["Computer Science", "Mathematics", "Physics"]
        },
        {
            "name": "Boston College",
            "address": {
                "line1": "140 Commonwealth Drive",
                "city": "Chestnut Hill",
                "state": "Massachusetts",
                "zip": "(617) 552-8000"
            },
            "phone": "(617) 253-8000",
            "courses": ["Law", "Nursing", "Theology"]
        },
        {
            "name": "UC Berkeley College",
            "address": {
                "line1": "320 Mclaughlin Hall",
                "city": "Berkeley",
                "state": "California",
                "zip": "92139"
            },
            "phone": "(510) 642-5771",
            "courses": ["Computer Science", "Business Administration", "Theology"]
        }
    ]
};


function getLevenshteinDistance(a, b) {
    if (a === b) return 0;
    if (a.length === 0) return b.length;
    if (b.length === 0) return a.length;

    const matrix = [];

    for (let i = 0; i <= b.length; i++) {
        matrix[i] = [i];
    }

    for (let j = 0; j <= a.length; j++) {
        matrix[0][j] = j;
    }

    for (let i = 1; i <= b.length; i++) {
        for (let j = 1; j <= a.length; j++) {
            if (b.charAt(i - 1) === a.charAt(j - 1)) {
                matrix[i][j] = matrix[i - 1][j - 1];
            } else {
                matrix[i][j] = Math.min(matrix[i - 1][j - 1] + 1, Math.min(matrix[i][j - 1] + 1, matrix[i - 1][j] + 1));
            }
        }
    }

    return matrix[b.length][a.length];
}

function isSimilar(str1, str2) {
    const threshold = 2;
    const distance = getLevenshteinDistance(str1.toLowerCase(), str2.toLowerCase());
    return distance <= threshold || str1.toLowerCase().includes(str2.toLowerCase());
}


function processCollege(college, query, searchKeys) {
    let matched = false;

    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && college[key].toLowerCase().includes(query.toLowerCase())) {
            matched = true;
            college[key] = college[key].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
        } else if (key === "courses" && Array.isArray(college[key])) {
            const matchedCourses = college[key].filter(course => course.toLowerCase().includes(query.toLowerCase()));
            if (matchedCourses.length > 0) {
                matched = true;
                college[key] = matchedCourses.map(course => course.replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`));
            }
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && college[key][subKey].toLowerCase().includes(query.toLowerCase())) {
                    matched = true;
                    college[key][subKey] = college[key][subKey].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
                }
            }
        }
    }

    return matched ? college : null;
}


function fuzzySearch(str, query) {
    const threshold = Math.ceil(query.length / 2);
    const normalizedStr = str.toLowerCase();
    const normalizedQuery = query.toLowerCase();

    return normalizedStr.includes(normalizedQuery) || levenshtein(normalizedStr, normalizedQuery) <= threshold;
}


function highlightMatches(str, query) {
    const regex = new RegExp(query, 'gi');
    return str.replace(regex, match => `<span class='highlight'>${match}</span>`);
}


function search(query, searchKeys) {
    let results = [];

    // Attempt to find exact matches...
    for (let college of institutes.colleges) {
        const processedCollege = processCollege({ ...college }, query, searchKeys);
        if (processedCollege) {
            results.push(processedCollege);
        }
    }

    // If no exact matches were found, try fuzzy search...
    if (results.length === 0) {
        for (let college of institutes.colleges) {
            const processedCollege = processCollegeFuzzy({ ...college }, query, searchKeys);
            if (processedCollege) {
                results.push(processedCollege);
            }
        }
    }

    return results;
}


function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}


function isFuzzyMatch(str1, str2, threshold) {
    return getLevenshteinDistance(str1.toLowerCase(), str2.toLowerCase()) <= threshold;
}



function processCollegeWithLevenshtein(college, query, searchKeys) {
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isSimilarLevenshtein(college[key], query)) {
            college[key] = college[key].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isSimilarLevenshtein(college[key][subKey], query)) {
                    college[key][subKey] = college[key][subKey].replace(new RegExp(query, "gi"), match => `<span class='highlight'>${match}</span>`);
                    return college;
                }
            }
        }
    }
    return null;
}


// SUCCESS: Finds "Berkeley"
console.log(search("berkly", ["name", "city", "state", "zip", "courses"]));

// SUCCESS: Finds all Massachusetts colleges...
console.log(search("mass", ["name", "city", "state", "zip", "courses"]));

// FAIL: Deliberate misspelling of the text "Theology" and did not give any results..
console.log(search("thiology", ["name", "city", "state", "zip", "courses"]));

 

  • 0

Not going to comment on the overall structure of the code but the reason the thiology search isn't working in your example is because you're not searching arrays. 

In your:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

It does loop through the array (because an array is an object in js), but it doesn't check the values because searchKeys doesn't contain the array indices, i.e. 0, 1, 2, 3, ...

As a quick workaround without changing too much you could just check if the key is an array too:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if ((searchKeys.includes(subKey) || Array.isArray(college[key])) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
            console.log(subKey);
                
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

 

Edited by ZakO
  • Like 1
  • 0
  On 14/11/2023 at 11:38, ZakO said:

Not going to comment on the overall structure of the code but the reason the thiology search isn't working in your example is because you're not searching arrays. 

In your:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if (searchKeys.includes(subKey) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

It does loop through the array (because an array is an object in js), but it doesn't check the values because searchKeys doesn't contain the array indices, i.e. 0, 1, 2, 3, ...

As a quick workaround without changing too much you could just check if the key is an array too:

function processCollegeFuzzy(college, query, searchKeys) {
    const threshold = 2;
    for (const key in college) {
        if (searchKeys.includes(key) && typeof college[key] === "string" && isFuzzyMatch(college[key], query, threshold)) {
            college[key] = highlightMatches(college[key], query);
            return college;
        } else if (typeof college[key] === "object") {
            for (const subKey in college[key]) {
                if ((searchKeys.includes(subKey) || Array.isArray(college[key])) && typeof college[key][subKey] === "string" && isFuzzyMatch(college[key][subKey], query, threshold)) {
            console.log(subKey);
                
                    college[key][subKey] = highlightMatches(college[key][subKey], query);
                    return college;
                }
            }
        }
    }
    return null;
}

 

Expand  

Thank you. I am new to JavaScritp, I've nor been doing it too long and I'm still learning. Thank you for guiding me

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Recently Browsing   0 members

    • No registered users viewing this page.
  • Posts

    • Shotcut 25.07 by Razvan Serea Shotcut is a free, open source, cross-platform video editor for Windows, Mac and Linux. Major features include support for a wide range of formats; no import required meaning native timeline editing; Blackmagic Design support for input and preview monitoring; and resolution support to 4k. Editing Features Trimming on source clip player or timeline with ripple option Append, insert, overwrite, lift, and ripple delete editing on the timeline 3-point editing Hide, mute, and lock track controls Multitrack timeline with thumbnails and waveforms Unlimited undo and redo for playlist edits including a history view Create, play, edit, save, load, encode, and stream MLT XML projects (with auto-save) Save and load trimmed clip as MLT XML file Load and play complex MLT XML file as a clip Drag-n-drop files from file manager Scrubbing and transport control Video Effects Video compositing across video tracks HTML5 (sans audio and video) as video source and filters 3-way (shadows, mids, highlights) color wheels for color correction and grading Eye dropper tool to pick neutral color for white balancing Deinterlacing Auto-rotate Fade in/out audio and fade video from and to black with easy-to-use fader controls on timeline Video wipe transitions: bar, barn door, box, clock (radial), diagonal, iris, matrix, and custom gradient image Track compositing/blending modes: Over, Add, Saturate, Multiply, Screen, Overlay, Darken, Dodge, Burn, Hard Light, Soft Light, Difference, Exclusion, HSL Hue, HSL Saturation, HSL Color, HSL Luminosity. Video Filters: Alpha Channel: Adjust, Alpha Channel: View, Blur, Brightness, Chroma Key: Advanced, Chroma Key: Simple, Contrast, Color Grading, Crop, Diffusion, Glow, Invert Colors, Key Spill: Advanced, Key Spill: Simple, Mirror, Old Film: Dust, Old Film: Grain, Old Film: Projector, Old Film: Scratches, Old Film: Technocolor, Opacity, Rotate, Rutt-Etra-Izer, Saturation, Sepia Tone, Sharpen, Size and Position, Stabilize, Text, Vignette, Wave, White Balance Speed effect for audio/video clips Hardware Support Blackmagic Design SDI and HDMI for input and preview monitoring Leap Motion for jog/shuttle control Webcam capture Audio capture to system audio card Capture (record) SDI, HDMI, webcam (V4L2), JACK audio, PulseAudio, IP stream, X11 screen, and Windows DirectShow devices Multi-core parallel image processing (when not using GPU and frame-dropping is disabled) DeckLink SDI keyer output OpenGL GPU-based image processing with 16-bit floating point linear per color component ShotCut 25.07 changelog: Added a Whisper.cpp (GGML) model downloader to the Speech to Text dialog. A model is no longer included in the download and installation reducing their sizes. Improved the System theme to follow the operating system palette on Windows (darker and more contrast), and improved its appearance on macOS dark mode. Added Settings > Theme > System Fusion that combines the operating system palette with the monochrome, symbolic icons of the Fusion themes. Added an Outline video filter that uses the input alpha channel--useful with rich text or assets with a transparent background. This means that, like Drop Shadow, it will not work as expected when used after a text filter on a video or image clip. Rather, you must use a text clip (transparent color generator with text filter) on an upper track. Other New Features Added the ability to drag the waveform peak line to adjust audio gain. Added Settings > Timeline > Adjust Clip Gain/Volume to turn off the above. Added rolling an edit/trim to Timeline: Hold Ctrl (command on macOS) while trimming to simultaneously trim the neighbor clip. Added a Soft Focus filter set. Added Audio/Video duration to the Slideshow Generator dialog, defaults to 4 hours. This facilitates using Slideshow Generator to make transitions between everything when including both video and images. (It still respects the source duration and in & out points; duration here is a maximum.) Surround Sound Mixing Improvements Added fader and surround balance to the Balance audio filter if channels > 2. Added Channels toggle buttons to many audio filters: Band Pass Compressor Delay Downmix Equalizer: 3-Band Equalizer: 15-Band Equalizer: Parametric Expander Gain/Volume High Pass Low Pass Limiter Mute Noise Gate Notch Added support for 4 channels in the Copy Channel audio filter. For example, now you can: Copy stereo music to the rear channels and use the fader in the Balance filter to reduce its volume, Downmix spoken word into the center channel and apply a Band Pass filter to it, and Route music or sound effects to the low-frequency channels and apply a Low Pass filter to it. Other Improvements Changed the default Export > Audio > Rate control to Average Bitrate for AAC, Opus, and MP3. Added the ability to add/use multiple Mask: Apply filters. Added support for Scrub While Dragging to trimming on the timeline. Added hold Shift to ripple-trim when Ripple is turned off. Added French (Canadian) and Lithuanian translations. Fixes Fixed Mask: Apply with multiple Mask: Simple Shape (broke in v25.05) Fixed exporting projects containing only Generator clips on Windows (broke in v25.05). Fixed converting 10-bit full to limited range (broke in v25.01). Fixed dropdown menus using Settings > Theme > System on Windows. Fixed Balance and Pan audio muted channels if audio channels > 2. Fixed Export > Use hardware encoder fails with H.264 on macOS 15. Fixed Properties > Convert or Reverse for iPhone 16 Pro videos with Ambisonic audio. Fixed a single frame fade out filter would either mute or make black. Fixed repairing a project (e.g. broken file links) with proxy turned on. Fixed doing Freeze Frame on the first frame of a clip. Download: ShotCut 25.07 | Portable | ARM64 ~200.0 MB (Open Source) View: Shotcut Home Page | Other Operating Systems | Screenshot Get alerted to all of our Software updates on Twitter at @NeowinSoftware
    • Win8 still takes the crown though in terms of worst Microsoft OS which all boiled down to it's horrible interface upon release as it was actually difficult to do even basic stuff. I remember trying it in a VM early on and it was a chore doing the basics we have had for ages and I quickly dumped that and never bothered with it again. in fact, prior to Win11 it was the only OS from Microsoft I have never used/had on a real machine and I have been using windows from Win v3.11 in mid-1990's through Windows 10. basically Win v3.11, 95, 98, Me, 2k, XP, 7, Vista, 10. but putting Windows 8 aside, I would probably place Win11 next in line (lets start from WinXP to date since that's basically when Windows got good and PC's pretty much went mainstream), but at least Win11 is not in the 'horrible' category as at least it's basic interface is normal. but if I ignore the interface, Win11 is a strong candidate, not only for telemetry and the like, but forced "requirements" which make people jump through hoops we should not have to all in the name of "better security", which personally I think is not enough of a boost to justify the forced "requirements". but one area you can tell Linux is faster is installing updates and, as a bonus, we generally don't need to reboot (short of like Kernel updates but those I am in no rush as my longest every main PC system uptime without a reboot was Aug 2023 until Jan 2025). but yeah, I suspect all of the background junk Windows has running does slow things a bit. p.s. speaking of that 'CachyOS', I used one of their custom Proton's (on Lutris) to get NTSync recently as while I usually use the more typical 'GE-Proton10-10' (this and '10-9' have NTSync support which was added recently. but 10-9 you have to manually enable where as 10-10 is automatic if it's available in the kernel to the system), I have one game which does not play back in-game videos with the Proton 10 series (you can hear audio but it's basically a black screen) but works in the 9 series and 'CachyOS' had a build from Jan 2025 that has NTSync, so I used that and it worked. I had to use 'PROTON_USE_NTSYNC=1' though since it's not enabled by default (along with 'sudo modprobe ntsync', or setup a udev rule etc if you want it to work in reboots without needing to do that command). one can see NTSync is working through MangoHud if you setup 'winesync' (just add that entry to "~/.config/MangoHud/MangoHud.conf") in the configuration file for ManngoHud or if you want to directly see it in proton log I did 'PROTON_LOG=1', which then creates a log in the Home folder which, at least on GE-Proton10-10, creates 'steam-default.log' and in that shows "wineserver: NTSync up and running!" where as if you don't it will generally show "fsync: up and running."
    • Riverdale... in space? Hope not.
    • Then why didn't Apple upstream their feature?
    • Unix is in fact not open source.
  • Recent Achievements

    • Week One Done
      Lokmat Rajasthan earned a badge
      Week One Done
    • One Month Later
      TheRingmaster earned a badge
      One Month Later
    • First Post
      smileyhead earned a badge
      First Post
    • One Month Later
      K V earned a badge
      One Month Later
    • Week One Done
      K V earned a badge
      Week One Done
  • Popular Contributors

    1. 1
      +primortal
      638
    2. 2
      ATLien_0
      241
    3. 3
      Xenon
      177
    4. 4
      neufuse
      155
    5. 5
      +FloatingFatMan
      123
  • Tell a friend

    Love Neowin? Tell a friend!