# Flatscraper - A Simple Web Scraper for Flat Listings 🔍🏠
**Flatscraper** is a lightweight web scraper that extracts flat listings from a specified URL. It uses the [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/) library to parse HTML and capture the essential details of each flat: **title**, **price**, and **location**.
## 🚀 Features
- 🏠 **Scrapes flat listings** from a specified URL
- 🔍 **Extracts key details:** title, price, and location
- 💾 **Saves data** to a CSV file
- ⚙️ **Supports command-line arguments** for customization
- 🛠️ **Easy to use and modify** for different websites
- 🧪 **Includes a simple test case** for demonstration
- 📚 **Utilizes Python's built-in libraries** along with BeautifulSoup for HTML parsing
- 🔔 **Discord Webhook integration** for notifications
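
The core flow described above (parse a page, pull out title, price, and location, write a CSV) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the sample HTML, the class names `listing`, `title`, `price`, and `location`, and the helper names `parse_listings` and `write_csv` are all assumptions, and the standard-library `csv` module stands in for whatever the project uses to write its output.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical markup; real provider pages use different tags and class names.
SAMPLE_HTML = """
<div class="listing">
  <h2 class="title">2-room flat with balcony</h2>
  <span class="price">450 EUR</span>
  <span class="location">Leipzig Plagwitz</span>
</div>
<div class="listing">
  <h2 class="title">Studio near the park</h2>
  <span class="price">320 EUR</span>
  <span class="location">Leipzig Connewitz</span>
</div>
"""

FIELDS = ("title", "price", "location")


def parse_listings(html):
    """Return one dict per listing with title, price, and location."""
    soup = BeautifulSoup(html, "html.parser")
    flats = []
    for card in soup.find_all("div", class_="listing"):
        flat = {}
        for field in FIELDS:
            node = card.find(class_=field)
            # A missing field becomes an empty string instead of raising.
            flat[field] = node.get_text(strip=True) if node else ""
        flats.append(flat)
    return flats


def write_csv(flats, handle):
    """Write the listings as CSV to an already open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(flats)


flats = parse_listings(SAMPLE_HTML)
buffer = io.StringIO()  # stands in for open("flats.csv", "w", newline="")
write_csv(flats, buffer)
print(buffer.getvalue())
```

Adapting this to another website would mostly mean changing the selectors inside `parse_listings`; the CSV step stays the same.
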
## 🏢 Housing Providers
- **LWB**
- **Lipsia**
- **BGL**
- **VLW**
- **Wogetra**
## 📦 Requirements
You can run the bot natively on your machine or use a Docker image. The requirements include:
- **Python 3.6 or higher**
- **BeautifulSoup4**
- **Requests**
- **Pandas**
- **Discord Webhook** (optional, for notifications)
- **Docker** (optional, for containerization)
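
For the optional Discord notifications, a webhook call boils down to a JSON `POST` with a `content` field. The sketch below only builds the request using the standard library, so it runs without network access. The helper name `build_webhook_request`, the message format, and the placeholder URL are assumptions; the project itself may rely on a dedicated webhook package instead.

```python
import json
import urllib.request


def build_webhook_request(webhook_url, flat):
    """Build (but do not send) a POST request announcing one flat."""
    message = f"New flat: {flat['title']} | {flat['price']} | {flat['location']}"
    payload = json.dumps({"content": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={
            "Content-Type": "application/json",
            # Assumption: a custom User-Agent, since some services
            # reject urllib's default one.
            "User-Agent": "flatscraper-example",
        },
        method="POST",
    )


# Placeholder URL; a real webhook URL comes from the Discord channel settings.
request = build_webhook_request(
    "https://example.invalid/webhook",
    {"title": "Studio near the park", "price": "320 EUR", "location": "Leipzig Connewitz"},
)
print(request.get_method(), json.loads(request.data)["content"])
# Sending it would be: urllib.request.urlopen(request, timeout=10)
```
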
## 🛠️ Installation
### 1. Environment Setup
Ensure that the `.env` file is configured correctly. An example is available in the `sample.env` file. Copy it to `.env` and fill in the required values.
The `SAP_SESSIONID` and `COOKIE_SESSSION` values become available after you perform a search on the LWB website. Use your browser's developer tools to locate them in local storage.
*Future versions will include automatic form processing to obtain a valid session ID.*
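
To catch a missing or half-filled `.env` early, a small check along these lines may help. It is only a sketch: how the project actually loads its configuration is not shown here, `parse_env` is a hypothetical helper, and the sample values are placeholders rather than real session data.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Surrounding quotes are optional in this simplified format.
        values[key.strip()] = value.strip().strip("'\"")
    return values


# Placeholder values; real ones come from the browser after an LWB search.
SAMPLE_ENV = """
# copied from sample.env
SAP_SESSIONID=example-session-id
COOKIE_SESSSION="example-cookie"
"""

REQUIRED = ("SAP_SESSIONID", "COOKIE_SESSSION")

config = parse_env(SAMPLE_ENV)
missing = [key for key in REQUIRED if not config.get(key)]
if missing:
    raise SystemExit(f"Missing values in .env: {', '.join(missing)}")
print(sorted(config))
```

In practice the text would come from reading the `.env` file (for example `open(".env").read()`) instead of the inline sample.
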
### 2. Python Environment
You can use a virtual environment to install the necessary packages:
```bash
# Create a virtual environment
python -m venv venv
# Activate the virtual environment
# On Windows:
venv\Scripts\activate
# On macOS and Linux:
source venv/bin/activate
# Install the required packages
pip install -r requirements.txt
```
### 3. Docker Environment
Alternatively, use the Docker image provided in the repository:
```bash
# Build the Docker image
docker build -t flatscraper .
# Run the Docker container
docker run -it --rm flatscraper
```
## 🎉 Have Fun and Happy Scraping!
Wishing you a great time and speedy flat searching with the bot. If you have any questions or suggestions, feel free to [open an issue on GitLab](https://gitlab.com). I'll respond as soon as possible.
*Happy scraping! 🚀*