Home

Welcome to the ticker_data wiki!

The core of this app is written mostly in Python3, but it leverages some additional libraries and services that require familiarity with other languages and query syntaxes. Below is a rundown of the code and some key info on dependencies...

Please note: As described, I am NOT a hard-core Dev/Eng Ninja. I am a Product Strategy leader who knows how to code. Please don't critique my code design, architecture or quality as if I were the former. Much appreciated in advance.

This is a Python3 app. It runs from the shell (I develop in both Linux & Windows 10, so it runs well in both envs, but I prefer Linux and do most of my testing there). - How to run the code...

Looking for screenshots? Here are some examples of the code running on the Linux command line...here

Core code - Python3
- I've structured the code in a pure OOP architecture. (I'm a Product Leader & not a Dev/Engineer, so it's not Google Eng/Dev production-quality Python3 OOP code, but it stays true to the OOP paradigm, i.e. Classes, Instances, Class Methods/Attributes, Inheritance etc.)
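For readers less familiar with the OOP terms listed above, here is a minimal sketch of how they fit together in Python3. The class and attribute names (`Quote`, `DelayedQuote`, `source`) are hypothetical and are not taken from the app's code.

```python
class Quote:
    """Hypothetical base class, for illustration only."""

    source = "unknown"  # class attribute, shared by all instances

    def __init__(self, symbol, price):
        # Instance attributes, unique to each object
        self.symbol = symbol
        self.price = price

    @classmethod
    def from_row(cls, row):
        # Class method: an alternate constructor that builds an instance from a dict
        return cls(row["symbol"], float(row["price"]))

    def describe(self):
        return f"{self.symbol} @ {self.price:.2f} ({self.source})"


class DelayedQuote(Quote):
    """Inherits everything from Quote and overrides one class attribute."""

    source = "exchange-delayed"


quote = DelayedQuote.from_row({"symbol": "AAA", "price": "10"})
print(quote.describe())
```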
Data Science code - Pandas and NumPy (API for Python3)
- The basic Pandas DataFrame API code is pretty simple Python code, but selecting & manipulating data within DataFrames is all Pandas and NumPy native idiom, and it will not look like Python3 code to anyone who doesn't know Pandas & NumPy.
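To give a feel for what "doesn't look like Python3" means, here is a small sketch of typical DataFrame selection and vectorized logic. The data and column names (`symbol`, `price`, `pct_change`) are made up for this example and do not come from the app.

```python
import numpy as np
import pandas as pd

# Hypothetical quote data; symbols and column names are invented
quotes = pd.DataFrame(
    {
        "symbol": ["AAA", "BBB", "CCC", "DDD"],
        "price": [10.0, 25.5, 3.2, 48.0],
        "pct_change": [1.5, -0.4, 7.9, 0.0],
    }
)

# Boolean-mask selection: keep rows that are up more than 1%
gainers = quotes[quotes["pct_change"] > 1.0]

# Label-based selection with .loc: filter rows AND pick columns in one step
cheap_gainers = quotes.loc[
    (quotes["pct_change"] > 1.0) & (quotes["price"] < 20.0),
    ["symbol", "price"],
]

# Vectorized NumPy logic: tag every row without writing a Python loop
quotes["direction"] = np.where(
    quotes["pct_change"] > 0,
    "up",
    np.where(quotes["pct_change"] < 0, "down", "flat"),
)

print(cheap_gainers)
print(quotes)
```

Note the `&` operator and the parentheses around each condition: plain Python `and` does not work on whole columns.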
Database injection & CRUD logic code - MongoDB (API for Python3)
- Although the Python3 MongoDB API is 100% Python, once the connection to the MongoDB database is live, the real code being executed is almost exclusively Mongo's JSON document query language (which is not Python3 at all).
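As a rough illustration of the point above: the Python side is just a dict, but its keys (`$in`, `$gte`) are Mongo query operators, not Python. In this sketch the field names and helper functions are hypothetical, and `collection` is assumed to be a `pymongo` Collection obtained from a live connection.

```python
def build_price_filter(symbols, min_price):
    """Build a Mongo query document (hypothetical field names)."""
    return {
        "symbol": {"$in": list(symbols)},  # match any of the given symbols
        "price": {"$gte": min_price},      # price greater than or equal to min_price
    }


def find_quotes(collection, symbols, min_price):
    """Run the query; `collection` is assumed to be a pymongo Collection."""
    projection = {"_id": 0, "symbol": 1, "price": 1}  # return only these fields
    return list(collection.find(build_price_filter(symbols, min_price), projection))


print(build_price_filter(["AAA", "BBB"], 5.0))
```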
Fast HTML data scraping - BeautifulSoup (bs4) for Python3
- Although HTML isn't a coding language (the way Python3 is), working with bs4 requires significant familiarity with HTML and its doc structure: tree, tags, objects, attributes etc. There's just no getting around this (which is why I hate HTML doc scraping)...but sometimes you can't avoid it and it's the only way to get to the raw data that you want.
ML and AI capabilities - Scikit-learn (sklearn) API for Python3
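A small sketch of why HTML structure knowledge matters with bs4: you have to know which tag, `id` and nesting holds the data before you can extract it. The HTML snippet, the `quotes` table id and the symbols below are invented for this example and do not come from any real site.

```python
from bs4 import BeautifulSoup

# Invented HTML, standing in for a scraped page
html = """
<table id="quotes">
  <tr><th>Symbol</th><th>Price</th></tr>
  <tr><td>AAA</td><td>10.00</td></tr>
  <tr><td>BBB</td><td>25.50</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Locate the table by tag name and id attribute
table = soup.find("table", id="quotes")

rows = []
for tr in table.find_all("tr")[1:]:  # skip the header row
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    rows.append({"symbol": cells[0], "price": float(cells[1])})

print(rows)
```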
- The sklearn code leverages a data corpus from the Natural Language Toolkit (https://www.nltk.org) in support of the CountVectorizer stopwords logic. The stopwords.words("english") corpus is a normal supporting data entity in ML code/logic. The English stopwords corpus data set MUST be loaded onto your file system. nltk must be able to find/access/read that corpus dataset as soon as the two lines below run, which happens while the module is being loaded. Otherwise Python3 will complain and error out (nltk raises a LookupError) before any real code executes. (See Dataset #70 here: http://www.nltk.org/nltk_data/.) That code looks like this...
  `from nltk.corpus import stopwords`
  `sw = stopwords.words("english")`
- Since the ML/AI code is new & in heavy dev, you can comment it out if you can't figure out the nltk.corpus data-set download/install procedure. (See here: https://www.nltk.org/data.html.)
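As an alternative to commenting the ML code out, one option is to guard the corpus load so a missing nltk install or a missing stopwords corpus does not stop the rest of the app. This is only a sketch: the helper name `load_english_stopwords` is hypothetical and is not part of the app's code.

```python
def load_english_stopwords():
    """Return the nltk English stopword list, or None if it is unavailable.

    Hypothetical helper: the import is done inside the function so that a
    missing nltk package or a missing corpus is caught here instead of
    crashing at module load.
    """
    try:
        from nltk.corpus import stopwords
        return stopwords.words("english")
    except (ImportError, LookupError):
        # ImportError: nltk is not installed
        # LookupError: nltk is installed but the stopwords corpus was not found
        return None


sw = load_english_stopwords()
if sw is None:
    print("nltk stopwords corpus not available; skipping the ML/AI features")
else:
    print(f"Loaded {len(sw)} English stopwords")
```

If the corpus is missing, the nltk data page linked above describes how to install it.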
Realtime exchange market data feeds - Leverages the V2 Alpaca API.
- Alpaca is a great API-first stock market data service (FINRA registered) designed specifically to handle heavy-volume financial markets time series data, trading portfolios & algorithmic trade execution. (https://alpaca.markets/docs/about-us/).
- Alpaca supports APIs in multiple languages, of which I'm implementing the Python3 API. (https://alpaca.markets/docs/). The Alpaca API language is relatively easy and quite Pythonic, but not Python per se. So you need to learn their data manipulation language & scheme. The Alpaca API semantics can be a bit odd at times & some API functions are poorly documented (annoyingly). Being familiar with Alpaca would be helpful if you wish to focus on real-time data/trading beyond scraping of exchange-delayed data via bs4. My Alpaca code is in the early phases of dev and I'm still deciding where/how I want to augment my overall application design with real-time Alpaca market data.
- Note: You can't get everything from Alpaca. Generally it's very difficult to get access to FREE live streaming, realtime stock market ticker data (from any MD provider). I've tried & you need to pay lots of $$ for this type of MD feed. Alpaca is a good cost/capability compromise (free) & is built by real Silicon Valley coders.
- In order for the Alpaca code to work, you WILL need to register an account with Alpaca (which may be difficult for non-USA citizens). My Alpaca API account keys have been removed from the code, and invalidated.
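Since the keys are no longer in the code, you will need to supply your own. One common approach is to read them from environment variables instead of hard-coding them. The sketch below is an assumption about how you might wire this up, not how the app does it: the helper name is hypothetical, and it assumes the `APCA_API_KEY_ID` / `APCA_API_SECRET_KEY` variable names and the matching `APCA-API-KEY-ID` / `APCA-API-SECRET-KEY` request headers that Alpaca's documentation describes.

```python
import os


def alpaca_auth_headers(env=None):
    """Build Alpaca auth headers from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    key_id = env.get("APCA_API_KEY_ID")
    secret = env.get("APCA_API_SECRET_KEY")
    if not key_id or not secret:
        # Fail early with a clear message rather than sending an unauthenticated request
        raise RuntimeError(
            "Set APCA_API_KEY_ID and APCA_API_SECRET_KEY before running the Alpaca code"
        )
    return {"APCA-API-KEY-ID": key_id, "APCA-API-SECRET-KEY": secret}


# Example with placeholder values (not real keys)
headers = alpaca_auth_headers(
    {"APCA_API_KEY_ID": "my-key-id", "APCA_API_SECRET_KEY": "my-secret"}
)
print(sorted(headers))
```

Keeping keys out of the source also means they can't be committed to the repo by accident.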

Regards,
~Orville
