Web Scraping with Python

In this workshop, you will learn how to extract data from websites using Python, a process known as web scraping. You will use several Python packages to retrieve the HTML of webpages and to extract specific content from both static and dynamic webpages.

Difficulty rating: ★★★☆ Intermediate

Who is it for?

Research staff and research students who are already familiar with the basics of Python.

Summary of the topics covered

Recap of how websites are structured using HTML
How to extract information from HTML data using the BeautifulSoup package
How to retrieve the HTML of a webpage using the requests package
The difference between static and dynamic webpages
How to scrape dynamic content using Selenium
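
To give a rough idea of what extracting information from HTML involves, here is a minimal sketch. It is not workshop material: the workshop uses the BeautifulSoup package for this step, whereas this sketch swaps in Python's built-in html.parser module so that it runs without installing anything. The HTML snippet, the LinkCollector class and the extract_links function are hypothetical names made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical HTML standing in for the source of a static webpage.
SAMPLE_HTML = """
<html>
  <body>
    <h1>Staff publications</h1>
    <ul>
      <li><a href="/papers/1">First paper</a></li>
      <li><a href="/papers/2">Second paper</a></li>
    </ul>
  </body>
</html>
"""


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current_href = None  # href of the <a> tag we are inside, if any
        self._text_parts = []      # text seen so far inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._text_parts = []

    def handle_data(self, data):
        # Only keep text that appears inside a link.
        if self._current_href is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            text = "".join(self._text_parts).strip()
            self.links.append((self._current_href, text))
            self._current_href = None


def extract_links(html):
    """Return a list of (href, text) tuples found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


for href, text in extract_links(SAMPLE_HTML):
    print(href, text)
```

With a real page, the HTML string would come from a download step (the workshop uses the requests package for that) rather than from a hard-coded sample.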

Prerequisites

This workshop is intended for learners who already have a basic understanding of Python. In particular, you should be comfortable with:

Installing and importing packages and modules
Using lists and dictionaries
Using conditional statements (if, else, elif)
Using for loops
Calling functions and understanding parameters, arguments and return values

You should also have attended the Introduction to Web Scraping course before this workshop.

Frequency

3 times a year

Duration

3 hours

Next course

Wednesday 18th March (13:00 - 16:00)

Book here

Can't attend?

Help with Python

If you need help using Python, you can contact our Research Software Group for advice, raise a ServiceDesk ticket with us, or attend one of our drop-in sessions.