Framework agreement for acquisition of software for establishing a databank in SKAT
Background
The outcome of the Turnus Analysis within SKAT's Business Intelligence and Analysis (BI&A) department during Q4 2014 and Q1 2015, and of the corresponding Turnus Analysis within IT in Q2 and Q3 2015, is a recommendation that SKAT establish a ‘Databank’ (the system) based on a scalable and distributed data and processing technology, for example Hadoop. The Databank will become a new and key architectural component in SKAT's future Information Management architecture. It will also play a very significant role in the future IT infrastructure as the component responsible for supporting data exchange across SKAT's business areas as well as to and from external partners.
The Databank comprises a technical infrastructure based on server, network and storage capacity, etc. (the platform) and software (the application). The objective of this tender is solely the acquisition of the software and the support services associated with it. A separate and parallel public tender on capacity will request the hardware, storage, network, etc. and the associated offerings (services and support) necessary to build and support the new Big Data Management platform. The capacity tender also covers requirements for scaling the platform's capacity (up or down) in order to right-size the platform to SKAT's future requirements.
The Databank combined with SKAT's current Information Management technology portfolio will provide data, reporting and analytical capabilities, including:
— Data exchange
— Ad hoc queries
— Business Intelligence
— Advanced Analytics
— Decision Automation
SKAT's strategy and ambitions with respect to these areas require implementation of the Databank. A component within the Databank will function as a consolidated and centralized data integration hub between existing and future business systems and SKAT's external partners. Besides becoming the centralized data hub for operational data, the Databank will support BI&A's business solutions for the reporting, analysis and analytics mentioned above.
Today, SKAT relies heavily on its IT-sourcing partners when data has to be applied outside the specific business systems. By taking ownership of its data, SKAT wants to achieve new degrees of freedom to exploit and control data, including:
— The ability to measure data quality without direct access to the production environments
— Shorter delivery times for new BI&A solutions
— Access from one source to data shared by business systems, which reduces integration complexity by reducing the number of integration points
— Reduced complexity will result in increased flexibility, security and stability in SKAT's IT landscape.
The new setup with the Databank is designed to extend flexibility, functionality and insight compared to the current data warehouse application in SKAT. By design, the Databank must be able to capture and store raw data at large scale, perform high-volume data transformations, define the structure of the data at the time of usage and scale to extreme data volumes at low cost. The Databank will become a key component within SKAT's IT architecture by serving as an enterprise data and integration hub for all data exchange between business systems and as an active archive for all business systems.
Becoming business critical, the Databank has to mitigate:
— Loss of data due to a crash of hardware or software, which requires fault tolerance at rack, server and hard disk level within the individual data centers.
— Reduced availability or non-availability of data due to downtime or breakdown of data centers, which can be addressed by distributing the Databank across multiple data centers with replication/commit of data in ‘near real time’ between data centers.
One of the conclusions of the Turnus Analysis in BI&A is the advice to acquire and establish the Databank as the future Big Data Management platform while reusing existing investments in technologies, e.g. Sybase ASE/IQ, Informatica and Business Objects, and leveraging existing investments in knowledge base, experience and competencies. The acquisition of a modern and distributed software solution is necessary in order to be able to deliver on the vision and strategy in an agile and cost-effective manner.
The acquisition
The first step in the process of establishing the Databank is to acquire a software solution that can handle increasing data volumes, integration complexity, and analytical and computational requirements that are becoming more and more business critical. The solution must ensure growth and handle complexity, computational requirements and fault tolerance at a reasonable TCO (Total Cost of Ownership). A distributed storage and processing solution is expected to meet these requirements, and SKAT is open to any software solution that meets these basic but critical requirements.
The Databank will become a key component responsible for collecting, storing and structuring all data from SKAT's internal production systems and to and from external systems and partners. The storage and processing model must be able to store any volume (from terabytes to petabytes) and any type (videos, pictures, sound, documents, etc.) of data in its original form and format, with linear scalability and cost base. Distributed data processing with schema on read must be supported for Data Warehouse purposes. For analytical and presentation purposes, the software must support distributed processing of very large data sets on computer clusters built on commodity hardware. To support search on large datasets, the software is required to include functionality for advanced full-text search and near real-time indexing, with near-linear scalability, automatic index replication, and automatic failover and recovery.
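To illustrate what "schema on read" means in practice, the following minimal Python sketch stores records in their original form and applies a schema only when the data is read. It is not tied to any particular product; the store, the schema and the function name `read_with_schema` are assumptions made purely for illustration.

```python
import json

# Raw records are kept exactly as received; no schema is enforced on write.
# (Hypothetical sample data for illustration only.)
RAW_STORE = [
    '{"id": "1", "amount": "125.50", "filed": "2015-03-01"}',
    '{"id": "2", "amount": "80", "note": "late filing"}',
]

# The reader chooses the schema: a mapping from field name to a parser.
REPORT_SCHEMA = {"id": int, "amount": float}


def read_with_schema(raw_lines, schema):
    """Parse each raw line and project it onto the schema at read time."""
    for line in raw_lines:
        record = json.loads(line)
        # Fields missing from a record become None; extra fields are ignored.
        yield {
            name: parse(record[name]) if name in record else None
            for name, parse in schema.items()
        }


rows = list(read_with_schema(RAW_STORE, REPORT_SCHEMA))
total = sum(row["amount"] for row in rows)
```

A different consumer could read the same raw records with another schema, for example one that includes the `filed` date, without any change to the stored data.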
In general SKAT wants to acquire software to deliver the following capabilities:
— Tools to efficiently move data between various storage formats
— Scalable batch processing of structured and unstructured data, through SQL-like syntax and/or through higher level logic programming
— Real-time and scalable read/write storage
— Tag and enrich data with metadata for later exploration of sources
— Track data lineage, e.g. how data is moved/transformed, when and by whom
— Low-latency analytic ability under multi-user Business Intelligence workloads
— Advanced analytics and modelling through in-memory batch and real-time processing
— Administration through user-friendly graphical user interfaces (GUI) and control scripts
— Role-based authorization with fine-grained control (see below)
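As an illustration of the data lineage capability listed above, the sketch below records how, when and by whom a dataset was produced and then walks the records back to the original sources. The class names, field names and dataset names are hypothetical and do not describe any specific product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class LineageEvent:
    dataset: str      # dataset produced by the step
    sources: tuple    # datasets the step read from
    operation: str    # how the data was handled (move, transform, ...)
    actor: str        # who ran the step
    at: datetime      # when the step ran


@dataclass
class LineageLog:
    events: list = field(default_factory=list)

    def record(self, dataset, sources, operation, actor, at=None):
        """Append one lineage event; the timestamp defaults to now (UTC)."""
        at = at or datetime.now(timezone.utc)
        event = LineageEvent(dataset, tuple(sources), operation, actor, at)
        self.events.append(event)
        return event

    def upstream(self, dataset):
        """Return every dataset that the given dataset was derived from."""
        seen = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            for event in self.events:
                if event.dataset == current:
                    for src in event.sources:
                        if src not in seen:
                            seen.add(src)
                            stack.append(src)
        return seen


log = LineageLog()
log.record("staging.returns", ["raw.returns"], "move", "etl_user")
log.record("mart.returns_by_year", ["staging.returns"], "transform", "analyst")
origins = log.upstream("mart.returns_by_year")
```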
The software must be able to support security at multiple levels:
— Data encryption at storage level
— A positive security model (‘Whitelist’)
— Central authorization on users/roles levels managed by Active Directory (AD)
— Row-level and column-level security and, at its most granular, cell-level security
— Masking/pseudonymization of data on querying
— Security protocols (HTTPS/SSL/VPN/Kerberos) between servers and clients for encryption and authentication in the application stack
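The combination of a whitelist model, column-level security and pseudonymization on querying can be sketched as follows. This is a minimal illustration under stated assumptions: the roles, column names, policy tables and key are hypothetical, and a real deployment would take roles from the central directory and manage keys securely.

```python
import hashlib
import hmac

# Hypothetical policy: role -> columns the role may see in clear text.
# Columns not listed are denied by default (a whitelist model).
CLEAR_COLUMNS = {
    "case_officer": {"person_id", "name", "income"},
    "analyst": {"income"},
}
# Columns a role may see only in pseudonymized form.
PSEUDONYMIZED_COLUMNS = {"analyst": {"person_id"}}

# Illustrative key only; a real key must never be hard-coded.
SECRET_KEY = b"example-key-not-for-production"


def pseudonymize(value: str) -> str:
    """Keyed hash: the same input always maps to the same token."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def apply_policy(row: dict, role: str) -> dict:
    """Return only the columns the role is allowed to see."""
    clear = CLEAR_COLUMNS.get(role, set())
    masked = PSEUDONYMIZED_COLUMNS.get(role, set())
    result = {}
    for column, value in row.items():
        if column in clear:
            result[column] = value
        elif column in masked:
            result[column] = pseudonymize(str(value))
        # Any other column is dropped: not whitelisted for this role.
    return result


row = {"person_id": "example-001", "name": "Jane Doe", "income": 412000}
analyst_view = apply_policy(row, "analyst")
unknown_view = apply_policy(row, "visitor")
```

Because the pseudonym is a keyed hash, an analyst can still join or count records per person without seeing the real identifier, while a role that is not in the policy receives no columns at all.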
In addition, the software must be able to support and comply with the upcoming EU Data Protection Regulation (EDPR) for data privacy and data protection, including access logging and auditing of data usage. In general, the software needs to support compliance with the current regulatory requirements of the personal data legislation in Denmark.
The current data ‘portfolio’ consists of structured data. Near-future requirements will include more complex data formats like video, sound/voice, pictures, documents, etc. As mentioned earlier, the software must be able to capture real-time data sources as well as huge batch loads and process them accordingly.
The software must integrate with technologies currently used at SKAT. Informatica is used for data management/integration and BusinessObjects for end-user reporting. Data Marts are based on Sybase IQ and Sybase ASE. Analytics is performed using SAS as well as Revolution R. It must be possible to use data modeling tools like Embarcadero ER/Studio or equivalent to support the development process.
SKAT will need to establish an initial production platform with a specific number of nodes (e.g. 12 nodes) and must be able to grow from this initial size to a future scenario of e.g. 50+ nodes. The software must not require specialized hardware or highly specialized infrastructure partners. The daily operations and governance of the Databank will initially be carried out by SKAT, but may later be outsourced to an external partner. The implementation must support and consist of multiple, separate environments for development, test and production. The production environment has to support a fail-over setup to mitigate single-point-of-failure scenarios.
The purpose of this public tender is only the acquisition of software and associated support services to support the Databank, such as:
— Availability and method of contact (named contact, within-the-hour support, etc.)
— Diagnostics assistance
— On-site support
— Access to knowledge base (online, libraries, best practices, etc.)
— Free access to upgrades, updates and patches, etc.
This tender covers only the acquisition of software and does not cover the acquisition of hardware. The software has to run on SKAT's commodity servers. A separate and parallel public tender will request the necessary capacity with respect to hardware, infrastructure and storage to support the Databank.
Deadline
The deadline for receipt of bids was 2016-01-21.
The procurement was published on 2015-12-21.
Procurement history
Date | Document
2015-12-21 | Contract notice
2016-01-07 | Additional information
2016-08-08 | Additional information