vscode: first script-test after a quick installation - some minor issues


Question

Hello dear fellows here at Neowin,

On MX Linux 19.1 I have installed VSCodium 1.43.2:

Version: 1.43.2
Commit: 0ba0ca52957102ca3527cf479571617f0de6ed50
Date: 2020-03-24T21:03:16.125Z
Electron: 7.1.11
Chrome: 78.0.3904.130
Node.js: 12.8.1
V8: 7.8.279.23-electron.0
OS: Linux x64 4.19.0-6-amd64

 

I have Python installed, unfortunately not in a venv but globally.

To test the whole system I just ran a little test script.

By the way, I will take care of setting up a venv later this weekend. For the moment I only want to test the system.

 

import requests
from bs4 import BeautifulSoup
import pandas as pd
page1 = requests.get('https://en.wikipedia.org/wiki/Peths_in_Pune').text
soup1 = BeautifulSoup(page1, 'lxml')  # the 'lxml' parser needs the lxml package
table = soup1.find('table', {'class': 'wikitable sortable'})
# Join the cells of each row with commas and collect all rows in table1.
table1 = ""
for tr in table.find_all('tr'):
    row1 = ""
    for tds in tr.find_all('td'):
        row1 = row1 + "," + tds.text
    table1 = table1 + row1[1:]
print(table1)

 

See the output:

martin@mx:~
$ /usr/bin/python3 /home/martin/dev/python/test.py
  File "/home/martin/dev/python/test.py", line 2
    ^
SyntaxError: unexpected EOF while parsing
martin@mx:~
$ /usr/bin/python3 /home/martin/dev/python/test.py
Traceback (most recent call last):
  File "/home/martin/dev/python/test.py", line 5, in <module>
    soup1 = BeautifulSoup(page1, 'lxml')
  File "/usr/lib/python3/dist-packages/bs4/__init__.py", line 196, in __init__
    % ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
martin@mx:~
$ python3 -V
Python 3.7.3
martin@mx:~

 

 

Any ideas or thoughts regarding this error?

 

Edited by tarifa

3 answers to this question


Your Linux installation is missing the libraries the script needs: bs4 and lxml.

pip can install whatever is missing, for either python or python3:


pip install lxml
pip install beautifulsoup4


pip3 install lxml
pip3 install beautifulsoup4
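
One thing to watch: the packages have to end up in the same interpreter that runs the script (here /usr/bin/python3). Running pip as "python3 -m pip install ..." is one way to keep the two aligned, and on Debian-based systems such as MX Linux the distribution package python3-lxml should be an alternative. Below is a minimal sketch, standard library only, that reports which modules the running interpreter cannot import. The helper name missing_modules is made up for this example.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names from `names` that this interpreter cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Show which interpreter is running, then check the modules the test script needs.
    print("interpreter:", sys.executable)
    print("missing:", missing_modules(["requests", "bs4", "lxml", "pandas"]))
```

Run it with the same command VS Code uses for the script (for example /usr/bin/python3) to see what that interpreter is still missing.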


Another hint: you can use the parser that ships with Python instead:

from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')

print(soup.prettify())
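
The two suggestions can be combined: use lxml when it is installed and fall back to the built-in html.parser otherwise. This is only a sketch. pick_parser is a hypothetical helper, and the BeautifulSoup call is left as a comment so the snippet runs even without bs4.

```python
import importlib.util


def pick_parser(preferred="lxml", fallback="html.parser"):
    """Return `preferred` if that module is importable, else `fallback`."""
    if importlib.util.find_spec(preferred) is not None:
        return preferred
    return fallback


parser = pick_parser()
print("using parser:", parser)
# With bs4 installed this would then be used as:
#   soup = BeautifulSoup(html_doc, parser)
```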

 

Does VS Code use some old packages?

I took a look at VS Code for Windows, just to see what is bundled with it, and found some 300 different projects and programs from GitHub.

With that many separate sources to keep up to date for the whole of VS Code, bugs are bound to occur.

You could probably have solved the problems by building VS Code from source, but other problems would surely turn up in the meantime.

Edited by Christopher Andreason
add some extra info
Reply to Christopher Andreason's post of 04/04/2020 at 01:11:

Hello dear Christopher,

first of all, many thanks for the quick reply. This thread is the result of a head start into Python with VS Code, since Atom does not seem to be able to do everything I want to do.

So I am trying to set up VS Code to work with Python.

 


 

Yes, I guess you are 100% right: I am missing some packages. And sure enough, my VS Code is not working with a venv yet, so I think I have to do a correct and proper setup.

Well, Christopher, I have read a bit and watched a whole bunch of tutorials on setting up a virtual environment for Python:

- in VS Code: best practices for using virtualenv
- and also for Atom (see above).

I am very glad that you posted and gave me some important hints:

 

 

  Quote

It seems that the environment in which I installed the package and the environment the project is currently using are different.
The package I installed went into the global environment, "c:\program files\python37\lib\site-packages", and it looks like the project is using a different environment (maybe one related to the project). So I try to identify this by looking at the corner of the VS Code window that shows which environment I am currently using.

 

So I guess that I have to use virtualenv to isolate my projects and then store the pip dependencies in requirements.txt.
As we develop, we install, remove and upgrade packages, so the list of dependencies in the project drifts away from requirements.txt.

But how can I do it right? How do I set up a venv the right way
a. on Windows 10 and
b. on Linux?

I have set up a Python development environment on a Windows 10 machine and on an MX Linux machine.
I guess that I set up the machine and Atom badly. Any and all help is greatly appreciated.

 

 

To take some steps towards a decent and correct installation of VS Code and its interaction with Python, I have written down some ideas:

Note: setting up Python globally is weird, or at least it seems so, judging by this thread.

Regarding the setup and usage of virtual environments in VS Code:

I recently read an article on using virtual environments for Python projects:
https://towardsdatascience.com/python-virtual-environments-made-easy-fe0c603fe601

and this one, "Comparing Python Virtual Environment tools":
https://towardsdatascience.com/comparing-python-virtual-environment-tools-9a6543643a44

I guess that I have to take care with how I set up Python on my Linux machine.

Comparing installed pip packages with requirements.txt: if we use virtualenv to isolate our projects and store the pip dependencies in requirements.txt, then as we develop (installing, removing and upgrading packages) the set of packages actually installed drifts away from requirements.txt.

Currently installed packages: to list the packages that are actually installed, we can run

$ pip freeze

Compare differences: a simple line-by-line comparison of requirements.txt and the pip freeze output will fail because the packages are listed in a different order. Sort both outputs first, then compare them.
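
The comparison can also be done order-independently with sets. Here is a minimal sketch under stated assumptions: the two lists are hypothetical sample data standing in for the lines of requirements.txt and the output of pip freeze, and normalize and compare are made-up helper names. It only compares exact "name==version" lines, with no understanding of version ranges or extras.

```python
def normalize(lines):
    """Return a set of lower-cased requirement lines, skipping blanks and comments."""
    cleaned = set()
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            cleaned.add(line.lower())
    return cleaned


def compare(requirements, frozen):
    """Return (only_in_requirements, only_in_freeze) as sorted lists."""
    req, inst = normalize(requirements), normalize(frozen)
    return sorted(req - inst), sorted(inst - req)


# Hypothetical sample data standing in for requirements.txt and `pip freeze` output.
requirements_txt = ["requests==2.23.0", "lxml==4.5.0", "# dev tools", ""]
pip_freeze = ["lxml==4.5.0", "beautifulsoup4==4.8.2", "Requests==2.23.0"]

missing, extra = compare(requirements_txt, pip_freeze)
print("in requirements.txt but not installed:", missing)
print("installed but not in requirements.txt:", extra)
```

In real use the first list would come from reading requirements.txt and the second from the output of "python3 -m pip freeze".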

To sum up, some of the best practices are the following:

- Always make sure requirements.txt reflects the actual dependencies.
- Pin package dependencies: use the exact version.
- Preferably track only the top-level dependencies in requirements.txt.
- Update dependencies periodically (yes, but how?).
- Consider using pip-compile and pip-sync to manage your dependencies. Use pur to automatically update your top-level dependencies in requirements.txt.

 
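I am starting to work in VS Code using venv. In my project folder I guess that I have to create a venv folder:

python3 -m venv /path/to/new/virtual/environment

For example, running python3 -m venv venv inside the project folder creates the environment in a subfolder named venv. (The command takes the target directory as its argument; giving both venv and a path would create two environments.)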
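
The same step can be scripted with the standard library's venv module, which is what "python3 -m venv" uses under the hood. The sketch below is only an illustration: create_venv is a hypothetical helper, the demo uses a throwaway temporary directory, and pip is left out (with_pip=False) to keep it quick. On Debian-based systems the python3-venv package may be needed for environments that include pip.

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_venv(env_dir, with_pip=False):
    """Create a virtual environment in env_dir and return its interpreter path."""
    venv.create(env_dir, with_pip=with_pip)
    env_dir = Path(env_dir)
    # The interpreter lives in Scripts\ on Windows and in bin/ elsewhere.
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


# Demo in a throwaway directory; in a real project env_dir would be e.g. "venv".
with tempfile.TemporaryDirectory() as tmp:
    interpreter = create_venv(Path(tmp) / "venv")
    print(interpreter, "exists:", interpreter.exists())
```

The returned path is the one that goes into the VS Code interpreter setting.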

But when I run the command "Python: Select Interpreter" in VS Code, my venv folder is not shown.

To make my virtual environment's interpreter visible in VS Code, I try the following steps:

1. Go to File > Preferences > Settings.
2. Click on Workspace settings.
3. Under Files: Associations, find "Edit in settings.json" and click on it.
4. Update "python.pythonPath": "my_venv_path/bin/python" under the workspace settings.
   (For Windows: update "python.pythonPath": "my_venv_path/Scripts/python.exe" under the workspace settings.)
5. Restart VS Code in case it still doesn't show the venv.
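
Steps 3 and 4 boil down to an entry in the workspace's .vscode/settings.json, so it can also be written by a small script. This is a sketch, not an official method: write_interpreter_setting is a hypothetical helper, the venv folder name "venv" is an assumption, and the key "python.pythonPath" is the one named above (newer releases of the Python extension use "python.defaultInterpreterPath" instead, which can be passed as key).

```python
import json
import sys
import tempfile
from pathlib import Path


def write_interpreter_setting(project_dir, venv_dir="venv", key="python.pythonPath"):
    """Write the venv interpreter path into <project_dir>/.vscode/settings.json."""
    project_dir = Path(project_dir)
    sub = ("Scripts", "python.exe") if sys.platform == "win32" else ("bin", "python")
    interpreter = project_dir.joinpath(venv_dir, *sub)

    settings_file = project_dir / ".vscode" / "settings.json"
    settings_file.parent.mkdir(parents=True, exist_ok=True)
    # Keep any settings already present (assumes plain JSON without comments).
    settings = json.loads(settings_file.read_text()) if settings_file.exists() else {}
    settings[key] = str(interpreter)
    settings_file.write_text(json.dumps(settings, indent=4))
    return settings_file


# Demo against a throwaway folder standing in for the project directory.
with tempfile.TemporaryDirectory() as project:
    path = write_interpreter_setting(project)
    print(path.read_text())
```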


Another option to show virtual environments in VS Code:

Go to the parent folder that contains the venv in a command prompt.
Type code . and press Enter. [Works on both Windows and Linux for me.]

That should also show the virtual environments present in that folder.

All my other projects need to be added to one workspace folder named Python.

To spell it out clearly:

- I would have only one venv for the whole workspace folder Python.
- I add each subfolder of the Python folder as a workspace project: Project1, Project2, Project3, Project4, Project5, Project6, etc.

In that project folder I created a venv environment and edited the workspace settings.json with "python.venvPath": "venv".

Now, for every new project I will create a new workspace, and inside that folder goes a venv folder, which will be recognized automatically.

 

+------------------------+
|                        |
|                        |
|     python-workspace   |
|     ....-folder        |
|                        |
+----------+-------------+
           |
           |
           |              +----------------------+
           |              |                      |
           +--------------+     Project1         |
           |              |                      |
           |              +----------------------+
           |
           |              +----------------------+
           |              |                      |
           +--------------+     Project2         |
           |              |                      |
           |              +----------------------+
           |
           |              +----------------------+
           |              |                      |
           +--------------+     Project3         |
           |              |                      |
           |              +----------------------+
           |
           |              +----------------------+
           |              |                      |
           +--------------+     Project4         |
           |              |                      |
           |              +----------------------+
           |
           |              +----------------------+
           |              |                      |
           +--------------+    Project5          |
           |              |                      |
           |              +----------------------+
           |
           |              +----------------------+
           |              |                      |
           +--------------+   Project6           |
                          |                      |
                          +----------------------+
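
To check which of the project folders in such a layout already have their own environment, one can look for the pyvenv.cfg file that the venv module writes into every environment it creates. A small sketch, assuming each venv sits directly inside its project folder. find_venvs is a made-up helper, and the demo builds a throwaway workspace with empty stand-in files.

```python
import tempfile
from pathlib import Path


def find_venvs(workspace):
    """Map project folder name -> venv folder for each venv found one level down."""
    found = {}
    for cfg in sorted(Path(workspace).glob("*/*/pyvenv.cfg")):
        env_dir = cfg.parent
        found[env_dir.parent.name] = env_dir
    return found


# Demo with a throwaway workspace; the pyvenv.cfg files are empty stand-ins.
with tempfile.TemporaryDirectory() as workspace:
    for name in ("Project1", "Project2"):
        env = Path(workspace, name, "venv")
        env.mkdir(parents=True)
        (env / "pyvenv.cfg").touch()
    Path(workspace, "Project3").mkdir()  # a project without a venv
    print(sorted(find_venvs(workspace)))
```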

 

Christopher, how do you like this idea?

 

Additionally, some questions regarding the practical use of VS Code for running and testing scripts:

I want to be able to export the data I have scraped as a CSV file. My question is: how do I write the piece of code that outputs the data to a CSV?

Christopher, the question is: can we run the code below in VS Code and have a closer look at the output? Does VS Code execute the code and store the file somewhere on the machine?

 

 

import csv
import requests
from bs4 import BeautifulSoup

# The CSV is created in the current working directory; newline='' avoids blank rows on Windows.
outfile = open('career.csv', 'w', newline='')
writer = csv.writer(outfile)
writer.writerow(["job_link", "job_desc"])

res = requests.get("http://implementconsultinggroup.com/career/#/6257").text
soup = BeautifulSoup(res, "lxml")
links = soup.find_all("a")

for link in links:
    href = link.get("href") or ""  # some <a> tags have no href attribute
    if "career" in href and 'COPENHAGEN' in link.text:
        item_link = href.strip()
        item_text = link.text.replace("View Position", "").strip()
        writer.writerow([item_link, item_text])
        print(item_link, item_text)
outfile.close()

 

 

We should now be able to run this in VS Code (and yes, I do not think that we need a fully fledged IDE such as PyCharm). I guess that we can now open the .py file and run it with the shortcut Ctrl+Shift+B (Windows) or Cmd+Shift+B (Apple), which runs the configured build task.


I have done a search on the net, and there are quite a few extensions for running Python:

Official Python extension [1]: This is a must-install.

Code Runner [2]: Incredibly useful for all sorts of languages, not just Python. Would highly recommend installing.

AREPL [3]: Real-time Python scratchpad that displays your variables in a side window. Its creator writes: "I'm the creator of this so obviously I think it's great, but I can't give an unbiased opinion."

Wolf [4]: Real-time Python scratchpad that displays results inline.

And if we use the integrated terminal, we can run Python in there without installing any extensions.

 

  [1]: https://marketplace.visualstudio.com/items?itemName=ms-python.python
  [2]: https://marketplace.visualstudio.com/items?itemName=formulahendry.code-runner
  [3]: https://marketplace.visualstudio.com/items?itemName=almenon.arepl
  [4]: https://marketplace.visualstudio.com/items?itemName=traBpUkciP.wolf
  [5]: https://marketplace.visualstudio.com/items?itemName=donjayamanne.jupyter
 

 

Dear Christopher, many thanks for your reply and for sharing your ideas.

I am very glad to be here and to be able to share ideas in this thread.

 

have a great day

 

regards 

tarifa

Edited by tarifa

Hello dear Christopher,

I reworked things in VS Code and ran the following code (note: with all the necessary plugins and extensions for Python loaded), and got back the output shown in the terminal; see below.


import requests
from bs4 import BeautifulSoup
import re
import csv
from tqdm import tqdm


first = "https://europa.eu/youth/volunteering/organisations_en?page={}"
second = "https://europa.eu/youth/volunteering/organisation/{}_en"


def catch(url):
    with requests.Session() as req:
        pages = []
        print("Loading All IDS\n")
        for item in tqdm(range(0, 347)):
            r = req.get(url.format(item))
            soup = BeautifulSoup(r.content, 'html.parser')
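            # Take the last path segment of each link and strip the "_en" suffix to get the ID.
            numbers = [item.get("href").split("/")[-1].split("_")[0] for item in soup.findAll(
                "a", href=re.compile("^/youth/volunteering/organisation/"), class_="btn btn-default")]
            pages.extend(numbers)  # collect the IDs from every page, not only the last one
        return pages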


def parse(url):
    links = catch(first)
    with requests.Session() as req:
        with open("Data.csv", 'w', newline="", encoding="UTF-8") as f:
            writer = csv.writer(f)
            writer.writerow(["Name", "Address", "Site", "Phone",
                             "Description", "Scope", "Rec", "Send", "PIC", "OID", "Topic"])
            print("\nParsing Now... \n")
            for link in tqdm(links):
                r = req.get(url.format(link))
                soup = BeautifulSoup(r.content, 'html.parser')
                task = soup.find("section", class_="col-sm-12").contents
                name = task[1].text
                add = task[3].find(
                    "i", class_="fa fa-location-arrow fa-lg").parent.text.strip()
                try:
                    site = task[3].find("a", class_="link-default").get("href")
                except Exception:
                    site = "N/A"
                try:
                    phone = task[3].find(
                        "i", class_="fa fa-phone").next_element.strip()
                except Exception:
                    phone = "N/A"
                desc = task[3].find(
                    "h3", class_="eyp-project-heading underline").find_next("p").text
                scope = task[3].findAll("span", class_="pull-right")[1].text
                rec = task[3].select("tbody td")[1].text
                send = task[3].select("tbody td")[-1].text
                pic = task[3].select(
                    "span.vertical-space")[0].text.split(" ")[1]
                oid = task[3].select(
                    "span.vertical-space")[-1].text.split(" ")[1]
                topic = [item.next_element.strip() for item in task[3].select(
                    "i.fa.fa-check.fa-lg")]
                writer.writerow([name, add, site, phone, desc,
                                 scope, rec, send, pic, oid, "".join(topic)])


parse(second)

 

 

See the output in the terminal. The question is: where are the results stored?
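
A note on where the file ends up: open("Data.csv", "w") with a bare file name writes into the current working directory of the Python process, which in VS Code's integrated terminal is typically the folder the terminal is in (often the opened workspace folder, though that depends on the setup). A minimal sketch to find out; resolve_output is a hypothetical helper name.

```python
from pathlib import Path


def resolve_output(filename, base_dir=None):
    """Return the absolute path a file name resolves to, relative to base_dir or the cwd."""
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    return (base / filename).resolve()


# A bare file name such as "Data.csv" is resolved against the current working directory.
print("current working directory:", Path.cwd())
print("Data.csv would be written to:", resolve_output("Data.csv"))
# To write next to the script instead, a script could pass
# Path(__file__).resolve().parent as base_dir.
```

Adding the two print lines to the top of the scraper shows the exact location of Data.csv on the machine.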

 

[Screenshot of the terminal output]

 

 

Should I change some more settings in VS Code? Do I need more plugins?

Looking forward to hearing from you.

Regards

tarifa
