A scraper application that crawls press releases, hearings, markups, and bills from the US Congress, industry associations, and think tanks for analytical purposes.
Time Range: Content from the past week (for most sources) and all future content.
Export Format: CSV, US Government, Think Tanks
Note: For easier navigation, think tank press content is located on a separate page from the US Government releases.
- Wilson: Date, URL, and title of insight and analysis for the Wilson Center's Insights & Analysis page;
  https://www.wilsoncenter.org/insight-analysis?_page=1&keywords=&_limit=10&programs=109
- Brookings: Date, URL, and title of insight and analysis for all content produced by the Brookings Institution;
  https://www.brookings.edu/search/?s=&post_type%5B%5D=&topic%5B%5D=&pcp=&date_range=&start_date=&end_date=
- CSIS: Date, type, title, URL, and description of insight and analysis for all content by the Center for Strategic & International Studies;
  https://www.csis.org/analysis
- Asia Society: Title, URL, and description of insight and analysis for all publications by the Asia Society Policy Institute;
  https://www.asiasociety.org/policy-institute/publications
- ICAS: Date, type, title, URL, and description of insight and analysis for all content by the Institute for China-America Studies;
  https://www.chinaus-icas.org/research-main/
- Atlantic Council: Date, category, title, URL, description, and tags of insight and analysis for all content by the Atlantic Council;
  https://www.atlanticcouncil.org/insights-impact/research/, https://www.atlanticcouncil.org/insights-impact/commentary/
- Daily Digests: Date, URL, and text providing details of legislation introduced, reported, passed, and considered by the full House or Senate each legislative day;
  https://www.congress.gov/bills-with-chamber-action/browse-by-date
- Daily Bill Texts: Date, PDF file, and text providing detailed information on legislation considered in Daily Digests;
  https://www.congress.gov/bill-texts-received-today
- All Bills: Date, URL, and other details (e.g. title, sponsor, committees, latest action) for all bills under the total of "All Bills, Resolutions, and Amendments";
  https://www.congress.gov/bills-with-chamber-action/browse-by-date
- Roll Call Votes: Date, name, and vote results of all Senate legislation passing through the 117th Congress;
  https://www.senate.gov/legislative/LIS/roll_call_lists/vote_menu_117_1.htm
- Floor Activity: Date, URL, and text providing details of Senate floor proceedings;
  https://floor.senate.gov/proceedings
- Commerce: Date, URL, title, and summary of press releases, hearings, and markups from the US Senate Committee on Commerce, Science, and Transportation;
  https://www.commerce.senate.gov/pressreleases, https://www.commerce.senate.gov/hearings, https://www.commerce.senate.gov/markups
- Foreign: Type of content (nominations, treaties, legislation, hearing transcripts, business meeting transcripts, committee reports, other), date, URL (if given), and text for activities and reports from the US Senate Committee on Foreign Relations;
  https://www.foreign.senate.gov/activities-and-reports
- Banking: Date, URL, and title for press releases, hearings, and markups from the US Senate Committee on Banking, Housing, and Urban Affairs;
  https://www.banking.senate.gov/newsroom/majority-press-releases, https://www.banking.senate.gov/hearings, https://www.banking.senate.gov/markups
- Finance: Source of content (majority, minority), date, URL, and title for press releases and hearings from the US Senate Committee on Finance;
  https://www.finance.senate.gov/chairmans-news, https://www.finance.senate.gov/hearings
- HLSGA: Source of content (majority, minority), date, URL, and title for press releases and hearings from the US Senate Committee on Homeland Security and Governmental Affairs;
  https://www.hsgac.senate.gov/media/majority-media, https://www.hsgac.senate.gov/hearings
- Judiciary: Source of content (majority, minority), date, URL, and title for press releases and hearings from the US Senate Committee on the Judiciary;
  https://www.judiciary.senate.gov/press/majority, https://www.judiciary.senate.gov/hearings
- Intelligence: Date, URL, title, and summary for news from the US Senate Select Committee on Intelligence;
  https://www.intelligence.senate.gov/press, https://www.intelligence.senate.gov/hearings
- Energy: Date, URL, title, and summary of press releases, hearings, and markups from the US House Committee on Energy and Commerce;
  https://energycommerce.house.gov/newsroom/press-releases, https://energycommerce.house.gov/committee-activity/hearings, https://energycommerce.house.gov/committee-activity/markups
- Financial Services: Date, URL, title, and summary of press releases, hearings, and markups from the US House Committee on Financial Services;
  https://financialservices.house.gov/news/, https://financialservices.house.gov/calendar/?EventTypeID=577&Congress=117, https://financialservices.house.gov/calendar/?EventTypeID=575&Congress=117
- Foreign: Date, time (if applicable), title, and URL for press releases, hearings, and markups from the US House Committee on Foreign Affairs;
  https://foreignaffairs.house.gov/press-releases, https://foreignaffairs.house.gov/hearings, https://foreignaffairs.house.gov/markups
- Homeland: Date, title, and URL for news, hearings, and markups from the US House Committee on Homeland Security;
  https://homeland.house.gov/activities/hearings, https://homeland.house.gov/activities/markups, https://homeland.house.gov/news
- Science, Space, and Tech: Date, URL, and title of press releases, hearings, and markups from the US House Committee on Science, Space, and Technology;
  https://science.house.gov/news/press-releases, https://science.house.gov/hearings, https://science.house.gov/markups
- Transportation: Date, URL, and title of press releases, hearings, and markups from the US House Committee on Transportation and Infrastructure (both majority and minority sites);
  https://republicans-transportation.house.gov/news/documentquery.aspx?DocumentTypeID=2545, https://republicans-transportation.house.gov/calendar/?EventTypeID=542, https://republicans-transportation.house.gov/calendar/?EventTypeID=541, https://transportation.house.gov/news/press-releases, https://transportation.house.gov/committee-activity/hearings, https://transportation.house.gov/committee-activity/markups
- Intelligence: Date, URL, title, and summary for news from the US House Permanent Select Committee on Intelligence;
  https://intelligence.house.gov/
- Energy: Date, URL, title, and summary of press releases, hearings, and markups from the Republicans on the US House Committee on Energy and Commerce;
  https://republicans-energycommerce.house.gov/news/, https://republicans-energycommerce.house.gov/hearings/, https://republicans-energycommerce.house.gov/markups/
- Foreign: Date, URL, title, and summary of updates, hearings, and markups from the Republicans on the US House Committee on Foreign Affairs;
  https://gop-foreignaffairs.house.gov/updates/, https://gop-foreignaffairs.house.gov/hearing/, https://gop-foreignaffairs.house.gov/markup/
- Homeland: Date, title, URL, and description for press releases from the Republicans on the US House Committee on Homeland Security;
  https://republicans-homeland.house.gov/committee-activity/press-releases/
- Science: Date, title, and URL for news, hearings, and markups from the Republicans on the US House Committee on Science, Space, and Technology;
  https://republicans-science.house.gov/news, https://republicans-science.house.gov/legislation/hearings, https://republicans-science.house.gov/legislation/markups
- SIA: Date, URL, and title of all headlines for the Semiconductor Industry Association;
  https://www.semiconductors.org/news-events/latest-news/
- FCC: Date, URL, and title of all headlines for the Federal Communications Commission;
  https://www.fcc.gov/news-events/headlines
- Clone repository.
- Run `./script.bash` in the terminal.
- Using Crontab (Mac/Linux) or Task Scheduler (Windows), set up an execution schedule to run the scraping job automatically.
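
For the Crontab route on Mac/Linux, a minimal sketch of a schedule entry might look like the following. The clone location (`$HOME/scraper`), the daily 06:00 run time, the `scrape.log` file name, and the `make_cron_line` helper are all assumptions made for illustration and are not part of this repository; adjust them to your setup.

```shell
# Hypothetical helper: print a crontab entry that runs the scraper once a day.
# $1 is the directory where the repository was cloned (an assumption).
make_cron_line() {
  repo_dir="$1"
  # Fields: minute hour day-of-month month day-of-week command
  # "0 6 * * *" means 06:00 every day; output is appended to a log file.
  printf '0 6 * * * cd %s && ./script.bash >> %s/scrape.log 2>&1\n' "$repo_dir" "$repo_dir"
}

# Preview the entry for an assumed clone location.
make_cron_line "$HOME/scraper"

# To install it, append the entry to the current user's crontab
# (this keeps any existing entries); uncomment to apply:
# ( crontab -l 2>/dev/null; make_cron_line "$HOME/scraper" ) | crontab -
```

The entry changes into the repository directory first so that `./script.bash` resolves relative paths the same way as a manual run. Task Scheduler on Windows needs an equivalent task configured separately.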