* Planning and Design
** Plan of Action
0. understand the current implementation and define data formats, required fields...
1. implement and benchmark a standalone db loader (with fake entries)
2. implement the reader into KStars
3. build toolchain to generate catalogs
** Brainstorming
*** Thoughts about DB creation
- use python library to make creating new catalogs *easy*
- c++ bindings into kstars to use trixel indexation etc
- unique ids and object id (for duplicates across catalogues)
- incremental crossref at creation time
*** Thoughts about Crossref
- generic DSO row contains a dynamic field with a list of catalogues (as it is currently)
- table for catalogues
- cross ref when importing, heuristic and explicit info...
- or crossref at creation time and then merge into master db or merge at runtime
** Design
*** Database
**** Tables
- master table which contains unique physical objects
  - benchmarking has shown that it is sensible to perform the merge ahead of time
  - merging can be done quite nicely with sqlite
- catalog tables, name: ~cat_[catalog id]~
**** Catalogs
- each catalog is its own table with common fields
- each has a unique integer ID and a (not necessarily unique) precedence
**** Columns
***** Catalogs
one single, inclusive format -> maybe discrimination later on
- type :: [[file:kstars/skyobjects/skyobject.h::enum TYPE][Type]] of object, integer
- coordinates :: ra, dec at J2000 epoch, doubles, degrees
- magnitude :: double
- name :: name of object, string
- longer name :: longform name, optional, string
- catalog identifier :: catalog internal number: for example NGC Number etc, optional, string
- catalog id :: id of the catalog, integer
- position angle :: optional, double, degrees
- id :: unique identifier, hash of preceding fields, binary
- a :: major axis (arcminutes)
- b :: minor axis (arcminutes)
- ~OID~ :: the id of the physical object for cross reference, binary
- trixel id :: trixel index in the skymesh, integer -> primary key
- flux :: integrated
  flux, for radio sources, double
- reserved :: three additional reserved columns
Only Master:
- dynamic :: whether to cache this object dynamically, boolean
***** Catalog Registry
- id :: integer
- name :: string
- mut :: whether the catalog is mutable, integer
- precedence :: in range (0, 1), real
- author :: string, optional
- source :: string, optional
- description :: string, optional
- enabled :: boolean
- version :: integer, required
**** Deduplication
- table entries get *hashed* to create *unique* identifiers (~hash~)
- entries have a second id field, the ~oid~, that is *unique* to the *physical object*
- handling of those duplicates is deferred to kstars
- detection of the duplicates is performed by the catalog creation tooling (python)
- catalogs have designated precedence:
  - simple double values -> enough flexibility and simpler than building some sort of precedence algo based on order relations
**** Creation
- python framework for basic code
- each catalog implemented as a module that provides functions for individual stages
- data acquisition handled by the modules
- python interface into kstars routines for trixel indexation is provided
***** Stages
****** Data Acquisition
- downloading etc of data to DISK!
- may be performed concurrently
- optional
****** Parsing
- data is parsed and inserted into the database through some kind of interface
- maybe provide some standard parsers
- framework provides some coordinate conversion (kstars interface, *astropy* etc...)
- trixel indexing happens automatically
- creates separate tables for individual catalogs
- ~oid~ is set to ~hash~
****** Deduplication
- successively performed on each catalog
- catalogs implement their own deduplication and can query all other catalogs
- routines for proximity or similar are provided
- duplicate lists are merged, so that dedupe is transitive
- duplicates will assume ~oid~ of the object with the highest precedence
****** Consolidation
- each catalog is written into its own database file
- the ~application_id~ is set to a kstars specific value
- file extension is ~kscat~
***** Duplicate Detection
- general heuristic: proximity or similar, names
- import order is not important
- custom heuristics per catalog
  - for example api calls or sth...
- some kind of access to other catalogs
**** Interface in KStars
- upon import: import table of catalog (and register catalog) + merge into master table
- master table is used to load objects into kstars
- when searching for objects by catalog/name: use catalog specific tables
***** Merge
- by ~OID~
- select from the catalog with highest precedence
****** Later :IMPROVEMENTS:
- could be made more granular later
***** Performance
- trixel indexation
- maybe LRU cache
- the DB wrapper in KStars will only be used as a simple fetcher and for search etc
- trixel index ~-1~ means "always show"
*** Integration into KStars
There are multiple ways of going about it here. The database handling will be wrapped in a class, much like in the current implementation but with fewer pointers and more move semantics. The integration into the composite/component system can be achieved in multiple ways. Common to them is that there are essentially two views of the catalog data: deduplicated and raw. Upon adding / activating / deactivating a catalog, the master table with deduplicated entries has to be recreated.
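The rebuild of the deduplicated master table described above (one row per ~oid~, taken from the enabled catalog with the highest precedence) can be sketched with plain sqlite. This is only an illustration under assumptions: the column subset (~oid~, ~name~, ~trixel~, ~magnitude~), the ~catalog~ column in ~master~, the tie-break on the lower catalog id and the function names are made up for the example and are not the actual KStars schema or code. It relies on SQLite window functions (SQLite 3.25 or newer).

```python
import sqlite3


def rebuild_master(conn: sqlite3.Connection) -> None:
    """Recreate ``master`` with one row per ``oid``, taken from the enabled
    catalog with the highest precedence (ties go to the lower catalog id)."""
    ids = [row[0] for row in
           conn.execute("SELECT id FROM catalogs WHERE enabled = 1 ORDER BY id")]
    conn.execute("DROP TABLE IF EXISTS master")
    conn.execute("CREATE TABLE master (oid BLOB, name TEXT, trixel INTEGER, "
                 "magnitude REAL, catalog INTEGER)")
    if ids:
        # Table names cannot be bound as parameters; the ids are integers
        # read from the registry, so formatting them in is acceptable here.
        union = " UNION ALL ".join(
            f"SELECT oid, name, trixel, magnitude, {int(i)} AS catalog "
            f"FROM cat_{int(i)}" for i in ids)
        conn.execute(f"""
            INSERT INTO master
            SELECT oid, name, trixel, magnitude, catalog FROM (
                SELECT u.*, ROW_NUMBER() OVER (
                    PARTITION BY u.oid
                    ORDER BY c.precedence DESC, u.catalog ASC) AS rn
                FROM ({union}) AS u JOIN catalogs AS c ON c.id = u.catalog)
            WHERE rn = 1""")
    # The notes ask for an index by trixel and magnitude on the master table.
    conn.execute("CREATE INDEX master_trixel_mag ON master (trixel, magnitude)")
    conn.commit()


def demo() -> list:
    """Two hypothetical catalogs that share one physical object (oid b'a')."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE catalogs (id INTEGER PRIMARY KEY, name TEXT, "
                 "precedence REAL, enabled INTEGER)")
    conn.executemany("INSERT INTO catalogs VALUES (?, ?, ?, ?)",
                     [(1, "NGC", 0.5, 1), (2, "Messier", 0.9, 1)])
    for cid in (1, 2):
        conn.execute(f"CREATE TABLE cat_{cid} (oid BLOB, name TEXT, "
                     "trixel INTEGER, magnitude REAL)")
    conn.executemany("INSERT INTO cat_1 VALUES (?, ?, ?, ?)",
                     [(b"a", "NGC 224", 10, 3.4), (b"b", "NGC 7000", 11, 4.0)])
    conn.execute("INSERT INTO cat_2 VALUES (?, ?, ?, ?)", (b"a", "M 31", 10, 3.4))
    rebuild_master(conn)
    return conn.execute("SELECT name, catalog FROM master ORDER BY name").fetchall()
```

In this toy setup ~demo()~ keeps the Messier entry for the shared object, because that catalog has the higher precedence, and the NGC-only object is kept unchanged. Dropping and recreating the table on every catalog change mirrors the "merge ahead of time" decision from the Tables section.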
When searching for objects the
**** Monolithic Approach
- create a single component to contain all the skyobjects from the database for all catalogs
- the objects will be dynamically loaded based on trixel id and there is one LRU cache
- loading happens through the DB wrapper and is implemented through accessing the deduplicated master database
- objects can be manually loaded into the component (i.e. after search)
I favour this approach because it is simpler to implement and does one thing and one thing only! The polymorphic approach can always be implemented afterwards through refactoring. Also, this makes it easier to incrementally replace the old catalog handling in kstars.
***** Pros
- management complexity at one point
- caching in one place
- easier to implement
***** Cons
- no real in-code separation between catalogs
- does not seem to fit within the spirit of the component system
- no trivial specialization possible -> do we even want that?
  - would complicate implementation
  - would violate catalog agnosticism
  - but is more flexible...
- does not map too well onto the catalog separation
**** Polymorphic Approach
- create a component and subcomponents for each catalog
- catalog subcomponents can have different specializations
- subcomponents would communicate with the database wrapper
- each subcomponent / catalog has its own instantiation routines for its objects and its own cache
***** Catalog Composite
- contains a map of some sort of catalogs
  - one master map containing all catalogs
  - then a separated auto-catalog map and a map for custom catalogs
- supports high level management
- delegates most things to the catalogs
***** Pros
- more flexible
- allows for specialization
  - user catalog
  - internet resolver catalog
***** Cons
- added complexity, more moving parts
- somehow mixing of concerns -> DSO catalogs and custom databases
- promoting misuse -> addition of weird subclasses and members, ad-hoc solutions, casting around types
**** Interface in Separate Thread :IMPROVEMENTS:
* Implementation Details
** Database
*** Access
Connection is opened as-needed.
- this allows multiple instances of the db manager to use the db
*** Error Handling
- mostly through ~~ pairs, because failure is not exceptional when dealing with external resources
*** Creation
- db basic metadata (version, htmesh level etc) is set
- user catalog is created
*** Initialization in KStars
to be performed after every catalog change
- rebuild a view of all catalogs
- create a master catalog by selecting objects unique by ~oid~ that have the highest precedence
- master table is indexed by trixel and magnitude
*** Representation in KStars
- DSO too bulky and outdated -> add ~CatalogObject~ with fewer fields (pgc, ugc)
- NO polymorphism because that would make things waaay more complicated
  - more classes to write
  - unique_ptrs...
- ~catalog_id=-1~ means catalog is unknown
**** Performance Tricks
- sort objects by magnitude and display them only at a certain zoom level
- because the objects get inserted into the vector in reverse order, the sql query and index sort magnitude descending
**** TO-DO
***** DONE make it configurable [[file:kstars/experiments/src/catalogscomponent.cpp::CatalogsComponent::draw(SkyPainter *skyp) {][here]], like [[file:kstars/skycomponents/deepskycomponent.cpp::return Options::showDeepSky();][here]]
***** HOLD support images :HOLD:
***** DONE implement object nearest [[file:kstars/skycomponents/skymapcomposite.cpp::oTry = m_DeepSky->objectNearest(p, rTry);][like here]]
***** DONE disable dsos at certain zoom level (off per default)
***** DONE maybe implement a by name index [[file:kstars/experiments/src/sqlstatements.h::const char _dso_by_name\[\] = "SELECT %1 FROM master WHERE trixel = "][for the by name query]]
***** DONE do more in the way of error handling [[file:datahandlers/catalogsdb.cpp::query.exec();][here]]
***** DONE make max dso count in [[file:kstars/tools/wutdialog.cpp::m_CatalogObjects = db.get_objects(m_Mag, 100);][wut dialog]] configurable
- obsolete: uses maglim
***** DONE make dso count in [[file:kstars/skycomponents/catalogscomponent.h::* there, so that references and pointers remain valid.][load static objects]] configurable
***** DONE add catalog info in details dialog
***** DONE better error handling (variant) and rollback!
***** HOLD do not auto-refresh master :HOLD:
***** DONE check defaults for a, b, flux
***** HOLD add checking of catalog meta [[file:kstars/catalogsdb/catalogsdb.cpp::std::pair CatalogsDB::read_catalog_meta_from_file(const QString &path)][here]] :HOLD:
***** DONE add TESTS for [[file:kstars/auxiliary/trixelcache.h::class TrixelCache][trixel cache]]
***** TODO mention the objectlist problem in the docs
**** Custom Catalogs
- always have highest precedence
- are mutable
- have version ~=-1~
- have ids ~>=1000~
** General DSO Component
- we hook into the kstars object list conundrum by loading only objects with names and below a certain apparent magnitude into memory and keeping them there permanently; only those objects will be accessible in What's Up Tonight and the search ui
** Catalog dump format
- exported as database with ~cat~ table containing the catalog data
- the table ~catalogs~ contains one row with meta information (same format as the ~catalogs~ registry)
- the ~pragma~ ~user_version~ is set to the database version
- the ~pragma~ ~application_id~ is set to ~0x4d515158~
*** TODO Compression
*** TODO Documentation
*** DONE catalogue
*** DONE fix file filter import
*** TODO some more user feedback in the catalog csv import
** KNewStuff
- hackily use the ~category~ field to store the id: ~dso_~
** Catalogs
*** DONE NGC
*** DONE IC
*** DONE Messier
*** DONE Sharpless
*** DONE Remove old NGC/IC/Messier references
** What's Interesting
- removed catalog functionality completely
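
Returning to the "Catalog dump format" section: a dump file as described there can be written and checked with a few lines of sqlite. The following is a rough sketch, not the KStars exporter. The ~application_id~ value and the ~cat~ / ~catalogs~ table names come from the notes above; the reduced column sets, the database version ~1~ and the function names are assumptions made for the example.

```python
import os
import sqlite3
import tempfile

APPLICATION_ID = 0x4D515158  # value given in the notes
DB_VERSION = 1               # hypothetical database version


def write_dump(path, meta, rows):
    """Write one catalog into its own sqlite file (reduced, assumed columns)."""
    conn = sqlite3.connect(path)
    # PRAGMA values cannot be bound as parameters, so they are formatted in.
    conn.execute(f"PRAGMA application_id = {APPLICATION_ID:d}")
    conn.execute(f"PRAGMA user_version = {DB_VERSION:d}")
    conn.execute("CREATE TABLE catalogs (id INTEGER, name TEXT, "
                 "precedence REAL, version INTEGER)")
    conn.execute("CREATE TABLE cat (name TEXT, ra REAL, dec REAL, magnitude REAL)")
    conn.execute("INSERT INTO catalogs VALUES (?, ?, ?, ?)", meta)
    conn.executemany("INSERT INTO cat VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    conn.close()


def read_dump_header(path):
    """Return (application_id, user_version) so an importer can reject
    files that are not catalog dumps or have an unexpected version."""
    conn = sqlite3.connect(path)
    app_id = conn.execute("PRAGMA application_id").fetchone()[0]
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    conn.close()
    return app_id, version


def demo():
    """Round trip through a temporary file with the ~kscat~ extension."""
    path = os.path.join(tempfile.mkdtemp(), "demo.kscat")
    write_dump(path, (1000, "demo", 0.5, 1), [("M 31", 10.68, 41.27, 3.4)])
    return read_dump_header(path)
```

Checking the two pragmas before touching any table is a cheap way for an importer to tell a catalog dump from an arbitrary sqlite file.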