• 0

asyncio web scraping: fetching multiple urls with aiohttp - doable?


Question

hello dear all,

 

i am fairly new to bs4for that matter, but im trying to scrape a little chunk of information from a site:

but it keeps printing "None" as if the title, or any tag if i replace it, doesn't exists.

 

The project consits of two parts:

 

the looping-part: (which seems to be pretty straightforward). the parser-part: where i have some issues - see below. I'm trying to loop through an array of URLs and scrape the data below from a list of wordpress-plugins. See my loop below-

 

 

from bs4 import BeautifulSoup

import requests

#array of URLs to loop through, will be larger once I get the loop working correctly

plugins = ['https://wordpress.org/plugins/wp-job-manager', 'https://wordpress.org/plugins/ninja-forms']

 

 

this can be done like so

 




ttt = page_soup.find("div", {"class":"plugin-meta"})
text_nodes = [node.text.strip() for node in ttt.ul.findChildren('li')[:-1:2]]

 

the Output of text_nodes:

 

['Version: 1.9.5.12', 'Active installations: 10,000+', 'Tested up to: 5.6 ']

 

but if we want to fetch the data of all the wordpress-plugins and subesquently sort them to show the -let us say - latest 50 updated plugins. This would be a intereting task

 

- first of all we need to fetch the urls

- then we fetch the iformation and have to sort out the _newest_

 

0 answers to this question

Recommended Posts

There have been no answers to this question yet

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Recently Browsing   0 members

    • No registered users viewing this page.
  • Posts

    • low latency mode is still bugged and causing bootup times slow to a crawl. To fix, you have to disable the feature with vivetool. Seems as though it's not rolled out to a lot of people yet since I've only been able to find only a handful of people that are having issues.
    • I would recommend the Nothing 2a. The battery life is awesome, 2 or 3 days without going into battery power mode. The only thing that I've been looking into recently is that it doesn't "support" Graphene OS. I'm pretty sure there is a way, I just need to do some more looking.
    • You'd have to show me an example of a listing that says Gen 1, usually i'd expect that to mean Snapdragon Gen 1 (a type of chipset, which the Pixels don't use). Pixel 7 - White - 128gb - Unlocked - 85%+ battery - Grade B+ - $159 with free delivery - https://www.ebay.com/itm/398046617206 Pixel 7 - Obsidian - 128gb - Unlocked - 80%+ battery - Very Good - $157 with free delivery - https://www.ebay.com/itm/355617734563 Both look to be sold by companies with good feedback, dealing with refurbished phones and state the phones are unlocked with a clean IMEI. Obviously I can't vouch for either company though, but the listings look good in my opinion.
    • Because Chrome is doing it. And no one said anyone had to update immediately. That's silly. They could update every day for all I care as long as it's fast, and the next time the browser restarts, you're good. And the basic point is not to tee it up for bigger updates. As it is right now, all the windows I had open reopen anyway except inprivate.
    • Why? Does anybody actually want this? The constant need to close all browser sessions and wait for a new version to install, just so that there’s a integrated coupon manager feels like a waste of everyone’s time
  • Recent Achievements

    • Week One Done
      davidbazooked earned a badge
      Week One Done
    • One Month Later
      Jamswaz earned a badge
      One Month Later
    • Week One Done
      Jamswaz earned a badge
      Week One Done
    • Rookie
      Marzoid went up a rank
      Rookie
    • Community Regular
      coch went up a rank
      Community Regular
  • Popular Contributors

    1. 1
      +primortal
      515
    2. 2
      PsYcHoKiLLa
      185
    3. 3
      +Edouard
      160
    4. 4
      Steven P.
      83
    5. 5
      ATLien_0
      75
  • Tell a friend

    Love Neowin? Tell a friend!