• 0

PHP OOP - Search Results


Question

I'm coding up a relatively simple search of a catalogue of products. Nothing amazing, but based off a simple inventory system of product/category/price/description/etc.

I have a class called "Product" that represents a product - and a class called "SearchItem" which extends Product and adds a "match" field (rating of the match against the text used to search).

Populating this class per item is easy enough, nothing amazing there...

But I want to make a class that contains all the results, which is then sortable and so on - rather than re-querying the database, etc.

I appreciate any suggestions here.

Thanks you :)

Link to comment
Share on other sites

11 answers to this question

Recommended Posts

  • 0

I assume a large enough record set so that you need pagination and only display one page at a time.

Re-sorting in the context of a paginated results display means a (very) different result-subset on the page needs to be displayed.

Sorting of the whole record-set(to determine what falls on your current page) is generally much much faster when done in the DB vs PHP.

If all you want is to sort the currently visible results, you can even do it with Javascript (jQuery Sortable plugin I think)

Generally, while it may be possible that you need to be sorting in PHP, if you want your app to be fast enough while your record set grows with the years is to avoid sorting the whole set in PHP.

If you do insist you want to sort with PHP, PHP's function usort() provides the means to sort an array of elements by using a custom(your own) function to compare two elements and help the algorithm with the understanding of which one is greater

I hope this helps

Link to comment
Share on other sites

  • 0

And then only populate my product class as needed, rather than an array of "products"?

My database is over 500,000 records, and a typical search may bring back a few thousand if not refined further. This is returning in 63 milliseconds based upon a fulltext search.

I _ASSUMED_ that it made more sense to cache these results in an object and to paginate, sort and display from said object.

Am I wrong? If I go to page 2 of results, or click to sort by price - am I better to go back to the database and repopulate my resultset?

Link to comment
Share on other sites

  • 0

As many people seem to try a search and if its not on the first page search for something else a good idea would also be to remove poor results as ratings are classified... eg, when you check the rating.... if the rating is below a certain level drop it (so it then doesn't have to be sorted) rather than keeping the whole catalogue in the sorting. (eg. check rating... if <8 drop if >8 apply and place in data set). This way when you come to sort your data set is already shrunk considerably.

As you get more items you could increase the rating check further.

Also you are best to only be sorting the ratings (or whatever is being sorted) and an ID, then you should re query the database to pull the results for 1-10 or 11-20 etc. - that way you aren't sorting any more info than what is needed.

Link to comment
Share on other sites

  • 0

Thank you - I take that on board, but I want to be able to offer the option of paginating through the results. Also only sorting on match and then going into the DB to lookup further is not an option, moreover would offer me no benefits (not performance wise) in this instance.

So, I reiterate the question:

And then only populate my product class as needed, rather than an array of "products"?

My database is over 500,000 records, and a typical search may bring back a few thousand if not refined further. This is returning in 63 milliseconds based upon a fulltext search.

I _ASSUMED_ that it made more sense to cache these results in an object and to paginate, sort and display from said object.

Am I wrong? If I go to page 2 of results, or click to sort by price - am I better to go back to the database and repopulate my resultset?

Link to comment
Share on other sites

  • 0

Okay starting to understand more and more what you are trying to do now.

IMO:

If you have things like "price" you should cache the order of the entire database with the IDs.

Then on your results set when price sorting is selected you take your results IDs - you pull all the results IDs from the price cache (will already be in order) and then fetch the rest of the data (which can also be cached)

Link to comment
Share on other sites

  • 0

So you basically want to store(cache) the full result-set in a session variable, which you use during subsequent pagination requests to avoid re-query-ing altogether.

What about the freshness ?

What about 1000 people simultaneously browsing your home page, for example, and seeing the very same list of latest news, for example.. You'd be caching the same thing 1000 times in the space of PHP, instead of letting MySQL query cache respond to those identical queries from it's own single cache item and take care of expirations, etc for you..

Then again if you're dealing with too unique and DB tormenting keyword searches, it may make sense to use some post-db caching. You then may use too much session storage space to cache massive sets, on a per session basis, completely independently..

Such scenarios I think are usually solved by getting a MUCH faster fulltext searching engine like Sphinx or Lucene, if you insist on precise pagination OR staying precise for 5 pages and then estimating total number of pages (Google, etc..)

The options you may have are heavily dependent on what your hardware/OS/server software situation is and what business rules you want to maintain...

Link to comment
Share on other sites

  • 0

I think I'm convinced - I will write a search class that takes such parameters as "search by" and "pagination" as inputs, and let SQL do what SQL does - rather than putting undue strain on the webserver.

Shame though - as I had seen a system letting me query everything to get my "your search returned x records", as well as the next 15 results.

Link to comment
Share on other sites

  • 0

P.S. What do you want to happen when you're on page 3 and you click sort by price ? Do you only want to sort the currently visible results (then some $_SESSION cache could be useful, indeed) or get on page 3 of the whole set sorted by price(DB much faster getting this data, with LIMIT, etc) ? or Going to page 1 of the whole result-set sorted by price (again, DB much faster)

Seems staying on the same page number, but sorting by a different field ($_SESSION cache) is very rarely used as it's not very intuitive to the user...

Link to comment
Share on other sites

  • 0
P.S. What do you want to happen when you're on page 3 and you click sort by price ? Do you only want to sort the currently visible results (then some $_SESSION cache could be useful, indeed) or get on page 3 of the whole set sorted by price(DB much faster getting this data, with LIMIT, etc) ? or Going to page 1 of the whole result-set sorted by price (again, DB much faster)

Seems staying on the same page number, but sorting by a different field ($_SESSION cache) is very rarely used as it's not very intuitive to the user...

Interesting question, that I hadn't considered just yet.

My answer should be the option that suits the user most - so I would addume sort by said field then return to the top of the list.

Link to comment
Share on other sites

  • 0

Hehe, sorry to bump the thread, but your situation got me thinking ;-)

In theory, if you feel like caching and your records are lean, you could cache in APC with some sensible Time To Live for freshness control.

This way you re-use the cached data between visitors' sessions, minimize the PHP-talks-to-DB overhead, generally pretty efficient.

The question is - what to cache in the case of fulltext search (relatively unique, so wil be piling up cache entries quickly, APC has a RAM limit..)

The more I think - the more I fall back to the DB doing all the work

1 query to get the COUNT(id) for the results, cached by the DB so blazingly fast the next times, until the table changes

1 query to use LIMIT X,Y and get your subset for the page (most efficient way as it's totally flexible about sorting, pagination, and very fast machine level code) cached by DB for the (possible) scenaio of a reuse

Cheers ;-)

Link to comment
Share on other sites

This topic is now closed to further replies.
  • Recently Browsing   0 members

    • No registered users viewing this page.