asyncio web scraping: fetching multiple urls with aiohttp - doable?

Asked by tarifa,
April 8, 2020

https://www.neowin.net/forum/topic/1394106-asyncio-web-scraping-fetching-multiple-urls-with-aiohttp-doable/


Question


Hello dear all,

I am fairly new to bs4, for that matter, but I'm trying to scrape a little chunk of information from a site:

but it keeps printing "None", as if the title (or any other tag, if I replace it) doesn't exist.

The project consists of two parts:

- the looping part (which seems to be pretty straightforward)
- the parser part, where I have some issues (see below)

I'm trying to loop through an array of URLs and scrape the data below from a list of WordPress plugins. See my loop below:

from bs4 import BeautifulSoup
import requests

# array of URLs to loop through, will be larger once I get the loop working correctly
plugins = ['https://wordpress.org/plugins/wp-job-manager', 'https://wordpress.org/plugins/ninja-forms']

This can be done like so:

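# page_soup is the BeautifulSoup object for one plugin page (its construction is not shown in this post)
ttt = page_soup.find("div", {"class": "plugin-meta"})
# take every second <li> of the list, leaving out the last one
text_nodes = [node.text.strip() for node in ttt.ul.find_all('li')[:-1:2]]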

The output of text_nodes:

['Version: 1.9.5.12', 'Active installations: 10,000+', 'Tested up to: 5.6 ']
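
For reference, here is a minimal sketch of how the looping part and the parser part could fit together, assuming requests and bs4 are installed. The helper names `parse_plugin_meta` and `scrape_plugins` are hypothetical, and the `plugin-meta` markup is taken from the snippet above, so the live page may look different. The explicit `None` check covers the case where `find` matches nothing, which is one way to end up with "None" being printed.

```python
from bs4 import BeautifulSoup
import requests

plugins = [
    "https://wordpress.org/plugins/wp-job-manager",
    "https://wordpress.org/plugins/ninja-forms",
]


def parse_plugin_meta(html):
    """Return the text of every second <li> in div.plugin-meta, or [] if it is absent."""
    page_soup = BeautifulSoup(html, "html.parser")
    meta = page_soup.find("div", {"class": "plugin-meta"})
    # find() returns None when nothing matches, so check before touching .ul
    if meta is None or meta.ul is None:
        return []
    return [node.text.strip() for node in meta.ul.find_all("li")[:-1:2]]


def scrape_plugins(urls):
    """Fetch each URL in turn and map it to its parsed meta entries."""
    results = {}
    for url in urls:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        results[url] = parse_plugin_meta(response.text)
    return results


# Usage (performs real HTTP requests):
# for url, meta in scrape_plugins(plugins).items():
#     print(url, meta)
```

If `parse_plugin_meta` returns an empty list for a real page, it may be worth printing part of `response.text` to check whether the expected `div` is present in the HTML that the server actually returns.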

But what if we want to fetch the data of all the WordPress plugins and subsequently sort them to show, let us say, the 50 most recently updated plugins? This would be an interesting task:

- first of all, we need to fetch the URLs

- then we fetch the information and have to sort out the _newest_
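
The two steps above could be sketched with asyncio and aiohttp roughly as follows. This is only an illustration under assumptions: `fetch_html`, `fetch_all`, `newest`, and the `(url, last_updated)` record shape are hypothetical names, and extracting the "last updated" value from each page is left out because the markup for it is not given here. aiohttp is imported inside `main` so that the helper functions depend only on the standard library.

```python
import asyncio


async def fetch_html(session, url):
    """Download one page with an aiohttp.ClientSession and return its text."""
    async with session.get(url) as response:
        response.raise_for_status()
        return await response.text()


async def fetch_all(urls, fetch_one, limit=5):
    """Run fetch_one(url) for every URL, with at most `limit` running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url):
        async with semaphore:
            return url, await fetch_one(url)

    # gather() keeps the results in the same order as `urls`
    return await asyncio.gather(*(bounded(url) for url in urls))


def newest(records, count=50):
    """Sort (url, last_updated) pairs newest first and keep `count` of them.

    last_updated can be any sortable value, e.g. a datetime.date or an
    ISO 'YYYY-MM-DD' string.
    """
    return sorted(records, key=lambda record: record[1], reverse=True)[:count]


async def main(urls):
    import aiohttp  # third-party; imported here so the helpers above need only the stdlib

    async with aiohttp.ClientSession() as session:
        return await fetch_all(urls, lambda url: fetch_html(session, url))


# Usage (performs real HTTP requests):
# pages = asyncio.run(main(plugins))  # list of (url, html) pairs
```

The semaphore keeps the number of simultaneous requests small, which seems sensible when looping over a large list of plugin pages on a single host.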
