
SYS202 Test 3

Terms

Shared Locking
Shared locking allows multiple transactions to "get" an object, but none may "update" it until an exclusive lock is obtained. Deadlock occurs more frequently than with exclusive locking.
Data cube
a multidimensional data model that allows enterprise data to be modeled as a set of facts organized along a number of characteristic dimensions.
Concurrency Issues
The "lost update" problem, The "uncommitted dependency" problem, The "inconsistent analysis" problem
Data Warehouse
A historical and unchanging data storage system
Exclusive Locking
An exclusive lock is typically held for the entire transaction
Using XPath
/node1/node2/.../finalnode/ returns nodes below final node
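As a rough illustration of a path expression, the sketch below uses Python's standard xml.etree.ElementTree module. That module supports only a subset of XPath and evaluates paths relative to the root element, so the root's name is left out of the path. The document and its element names (library, shelf, book) are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the element names are invented for illustration.
doc = """
<library>
  <shelf>
    <book>Databases</book>
    <book>Networks</book>
  </shelf>
</library>
"""

root = ET.fromstring(doc)  # root is the <library> element

# Comparable to the XPath /library/shelf/book. ElementTree paths start
# below the root element, so "library" itself is omitted.
books = root.findall("shelf/book")
titles = [b.text for b in books]
print(titles)
```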
XML is not
Not a replacement for relational databases, not a programming language
Data mining
refers to the automated discovery of implicit patterns and unstructured knowledge contained in large data stores.
Query Manager
the data warehouse software component that fields requests for data warehouse content and manages the search, retrieval, and presentation of retrieved warehouse data
Concurrency Control
Dealing with multiple users accessing and changing the database at the same time
Operational Data sources
Operational, transaction databases of the enterprise plus external data sources
Cube
The organization of facts and dimensions
ETL (Extract, Transform, Load) tools
Extraction: selecting the required data (tables, tuples) from operational data sources for transfer to the warehouse, which requires access to the database schemata to locate the required data properly (i.e., SQL SELECT statements). Transformation: manipulating the retrieved source data and "cleaning it up" (e.g., dealing with missing data) or changing the data (converting from one base to another, summarizing fields, etc.). Loading: storing the extracted and transformed data in the appropriate warehouse structure.
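The three ETL stages can be sketched with Python's built-in sqlite3 module. This is only a toy illustration under stated assumptions: the table names (sales, sales_by_region), the columns, and the sample rows are all made up, and two in-memory databases stand in for the operational source and the warehouse.

```python
import sqlite3

# Hypothetical operational source; table and column names are invented.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (region TEXT, amount_cents INTEGER)")
source.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 1250), ("east", None), ("west", 800), ("west", 700)],
)

# Extraction: pull the required columns with a SQL SELECT statement.
rows = source.execute("SELECT region, amount_cents FROM sales").fetchall()

# Transformation: skip rows with missing amounts, convert cents to
# dollars, and summarize per region.
totals = {}
for region, cents in rows:
    if cents is None:
        continue  # one simple way of dealing with missing data
    totals[region] = totals.get(region, 0.0) + cents / 100

# Loading: store the summarized data in a (hypothetical) warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales_by_region (region TEXT, total_dollars REAL)"
)
warehouse.executemany(
    "INSERT INTO sales_by_region VALUES (?, ?)", sorted(totals.items())
)
warehouse.commit()

loaded = warehouse.execute(
    "SELECT region, total_dollars FROM sales_by_region ORDER BY region"
).fetchall()
print(loaded)
```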
Transaction Process Monitor
Software that handles transactions. Two approaches: 1) make changes as they occur; 2) accumulate changes in a transaction log and save them only once the transaction has completed
Support (Association Rule)
= (number of transactions containing both A and B, i.e., the itemset A∪B) / (total number of transactions); the percentage s of all transactions that support A => B
Durability
Once a transaction commits, its updates survive (even a system crash).
HOLAP
Hybrid mix of Relational and Multidimensional data warehouse
dimension
like attribute for a data warehouse
ROLAP
A relational OLAP server uses a relational database front end for the data warehouse
Uncommitted dependency
User B updates a value and User A then retrieves it; User B's update is rolled back, so User A has seen a value that was never committed. Can happen in reverse order
Inconsistent Analysis
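User A is summing three values and has already collected one of them when User B updates a value involved in the sum, so User A's total reflects an inconsistent state of the database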
MOLAP
Multi dimensional model of data warehouse
Steps in data mining
Determine task-relevant data, select general approach, define measure of interestingness
Transaction Results (2)
1) commit - work finished, changes made 2) Rollback/Restart - work failed, undo work and transaction restarted
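The two outcomes can be seen in a small sketch using Python's built-in sqlite3 module. The accounts table, the transfer function, and the "insufficient funds" rule are assumptions made up for this example; the point is only that a failed unit of work is rolled back as a whole while a successful one is committed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one logical unit of work."""
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.commit()    # 1) commit: work finished, changes made
        return True
    except ValueError:
        conn.rollback()  # 2) rollback: work failed, changes undone
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer(conn, "A", "B", 30)       # succeeds and commits
after_ok = balances(conn)
failed = transfer(conn, "A", "B", 500)  # fails and rolls back
after_failed = balances(conn)
```

This sketch only undoes the work; a transaction monitor that restarts the failed transaction would need an extra retry step.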
SOA Services
There is a separation of functionality based on the idea of a service. Clients consume services that are provided by one or more servers. A service can support multiple clients and may itself be a client of some other services.
XML
Semistructured data model; eXtensible Markup Language; a software- and hardware-independent means of information exchange
Subject-Orientated
data warehouses are organized around "subjects" - things or objects of interest to the enterprise for decision-making purposes in contrast to an enterprise process view point.
SOAP
Simple Object Access Protocol
Types of data mining
Visualization, Prediction, Cluster Analysis, Decision Trees, Association Rule
OLAP operations
roll-up, drill down, slice and dice
Non-volatile
no insertions, deletions, or updates, just growth in size
Association Rule
Has the form A => B, which is interpreted as "tuples that satisfy condition (attribute/value pair) A are likely to satisfy attribute/value pair B". Such rules are derived as empirical probability estimates, i.e., by counting the number of times A and B occur together in a sample of some (large) size.
Four Properties of Transactions
Atomicity, Consistency, Isolation, Durability
Facts
are numerical measures; they are the fundamental data items pertaining to a particular subject, such as students, products, patients, etc.
OLAP
Online Analytical Processing, data summarization and aggregation
Web service
a function library residing on a server that can be called by remote applications
Consistency
Transactions preserve database consistency
Atomicity
Transactions are all or nothing
Time Stamping
the preferred method of concurrency control in a distributed environment.
Metadata repository
A database about data including a data dictionary. Operational database schemata
Drill-Down
this operator is the reverse of roll-up: we either step down a concept hierarchy (e.g., drilling down in the "US" cuboid) or add a dimension to a cuboid (e.g., adding Gender back into the "all_cube").
Star schema
A commonly used structure for representing a data warehouse. It is centered upon a fact table containing the raw data and a set of dimension tables.
Encapsulation of Services
The details of accomplishing a task are internal to a service and are not exposed to clients. Service behavior can be modified without affecting clients as long as interfaces are not changed.
Slice and Dice
the slice operator works on one dimension of a data cube resulting in a sub-cube; the dice operator defines a new sub-cube by selecting on two or more dimensions.
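A minimal sketch of the two operators, assuming a tiny cube stored as a Python dict keyed by (region, gender, year). The dimension names, the cell values, and the select helper are all hypothetical and exist only for this illustration.

```python
# Hypothetical cube: coordinates (region, gender, year) -> count.
DIMS = ("region", "gender", "year")
cube = {
    ("US", "F", 2020): 5,
    ("US", "M", 2020): 3,
    ("US", "F", 2021): 4,
    ("EU", "F", 2020): 2,
    ("EU", "M", 2021): 6,
}


def select(cube, **conditions):
    """Keep the cells whose coordinates match every given condition.

    Each keyword names a dimension and gives the set of allowed values.
    """
    sub_cube = {}
    for coords, value in cube.items():
        cell = dict(zip(DIMS, coords))
        if all(cell[dim] in allowed for dim, allowed in conditions.items()):
            sub_cube[coords] = value
    return sub_cube


# Slice: a selection on one dimension.
slice_2020 = select(cube, year={2020})

# Dice: a selection on two or more dimensions.
dice_us_f = select(cube, region={"US"}, gender={"F"})
```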
Lost update
User A retrieves and User B retrieves, User A updates before User B updates (User A's update is lost)
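The interleaving described above can be replayed step by step. This is a deterministic simulation with made-up values (a stock count of 10, sales of 3 and 2), not real concurrent code.

```python
# A shared "database" holding one value.
db = {"stock": 10}

# Both users retrieve the same starting value.
a_copy = db["stock"]
b_copy = db["stock"]

# User A updates first, based on its own copy (sells 3 units).
db["stock"] = a_copy - 3

# User B then updates based on its stale copy (sells 2 units),
# overwriting User A's change.
db["stock"] = b_copy - 2

final_stock = db["stock"]
expected_if_serial = 10 - 3 - 2  # what running A and then B would give

print(final_stock, expected_if_serial)
```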
SOAP Request
The SOAP request is just an HTTP POST request. The body of the POST request is an XML document: the root node is the SOAP envelope, which has a child node for the "body", which has a child node for the function, which in turn has children for the arguments.
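The nesting described above can be sketched with Python's standard library. The URL, the function name GetPrice, and its arguments are hypothetical, and the request is only built, never sent. A real service would usually require more than this, such as a namespace on the function element and a SOAPAction header.

```python
import urllib.request
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_request(url, function, arguments):
    """Build (but do not send) an HTTP POST carrying a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")       # root node
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")  # child: body
    call = ET.SubElement(body, function)                  # child: function
    for name, value in arguments.items():                 # children: arguments
        ET.SubElement(call, name).text = str(value)
    payload = ET.tostring(envelope)                       # XML as bytes
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


# Hypothetical endpoint and function, for illustration only.
request = build_soap_request(
    "http://example.com/service", "GetPrice", {"item": "apple", "quantity": 3}
)
print(request.get_method())
print(request.data.decode())
```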
Service Oriented Architecture
An application can be "factored" into its functional components. These functional components could be run on different machines. Any component can use the services of another by sending messages and receiving results.
Remote Procedure Calls (RPC)
Allow remote calls to the server without resending all the information
SOA ideas
Independent of hardware and software capabilities, using servers
Business Intelligence
Using data warehouse to gather data to make business decisions
XML Rules
There must be exactly one pair of tags for the root, and all other text is contained within these tags. Every start tag must have a matching end tag. Elements may not overlap
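These rules can be checked mechanically, since an XML parser rejects any document that breaks them. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed and the sample strings are made up for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as a well-formed document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "ok": "<a><b>text</b></a>",
    "two roots": "<a></a><b></b>",        # more than one root element
    "missing end tag": "<a><b>text</a>",  # <b> is never closed
    "overlapping": "<a><b></a></b>",      # elements overlap
}

results = {name: is_well_formed(xml) for name, xml in samples.items()}
print(results)
```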
Data Warehouse DBMS
a storage organization to hold the data warehouse content in multiple levels of detail to accommodate different reporting and analysis requirements
User Tools
Reporting and query tools, OLAP, Executive Support System, Decision Support System, Data mining tools
Confidence (Association Rule)
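= P[A∩B] / P[A], i.e., (number of transactions containing both A and B) / (number of transactions containing A); the percentage c of transactions containing A that also contain B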
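A small worked example of confidence, together with the support measure defined elsewhere in this deck, for a rule A => B. The four shopping-basket transactions are made up, and A and B are taken to be sets of items.

```python
# Hypothetical sample of transactions (each is a set of items).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def count_containing(items, transactions):
    """Number of transactions that contain every item in items."""
    return sum(1 for t in transactions if items <= t)


def support(a, b, transactions):
    """Fraction of all transactions that contain both A and B."""
    return count_containing(a | b, transactions) / len(transactions)


def confidence(a, b, transactions):
    """Fraction of the transactions containing A that also contain B."""
    has_a = count_containing(a, transactions)
    if has_a == 0:
        return 0.0  # assumption: treat a rule with no evidence as 0
    return count_containing(a | b, transactions) / has_a


A, B = {"bread"}, {"milk"}
s = support(A, B, transactions)     # 2 of 4 transactions have both
c = confidence(A, B, transactions)  # 2 of the 3 with bread also have milk
print(s, c)
```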
How time stamping works
Every transaction is assigned a unique (system-wide) identifier at start time. Time stamping does not employ locks.
XPath
Query language for navigating an XML document and selecting nodes from it
Uses of data warehousing
Capturing and archiving data, Analyzing data
Wildcards in XPath (/*/)
/*/*/finalnode/ matches any path of that length ending in finalnode (all nodes named finalnode at the third level, whatever the names of the two nodes above them)
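The effect of the * wildcard can be tried out with Python's xml.etree.ElementTree, which supports a subset of XPath and evaluates paths relative to the root element, so the first * of /*/*/book is played by the root itself. The document below is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with <book> elements at two different depths.
doc = (
    "<store>"
    "<fiction><book>A</book></fiction>"
    "<science><book>B</book></science>"
    "<book>C</book>"
    "</store>"
)
root = ET.fromstring(doc)

# Comparable to /*/*/book: any root, any second-level element, then book.
third_level = [b.text for b in root.findall("*/book")]
print(third_level)  # book C sits at the second level, so it is not matched
```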
Serializability of transactions
A given interleaved execution of the transactions is serializable iff it produces the same result as some serial execution of the transactions.
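The definition suggests a brute-force check: run the transactions in every possible serial order and see whether the interleaved result matches one of them. The sketch below does this for two made-up transactions on a single value; it compares only final results, which is a simplification for illustration.

```python
from itertools import permutations


def t1(x):
    """Hypothetical transaction: add 10 to the value."""
    return x + 10


def t2(x):
    """Hypothetical transaction: double the value."""
    return x * 2


def serial_results(transactions, start):
    """Results of running the transactions one after another, in every order."""
    results = set()
    for order in permutations(transactions):
        value = start
        for txn in order:
            value = txn(value)
        results.add(value)
    return results


def is_serializable(interleaved_result, transactions, start):
    """True iff the result equals that of some serial execution."""
    return interleaved_result in serial_results(transactions, start)


start = 5
# Interleaving in which both read 5, t1 writes 15, then t2 writes 10.
bad_interleaving = 10
# Interleaving that happens to behave like t1 followed by t2.
good_interleaving = 30

print(serial_results([t1, t2], start))
print(is_serializable(bad_interleaving, [t1, t2], start))
print(is_serializable(good_interleaving, [t1, t2], start))
```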
XML + HTTP
Using XML to encode function calls and using an HTTP server to send and receive them
Time-variant
data are archived to the data warehouse from operational databases at discrete points in time.
Wildcards in XPath (//)
//finalnode matches every path that ends in finalnode (all nodes named finalnode, at any depth)
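In Python's xml.etree.ElementTree (which supports only a subset of XPath), the closest equivalent of //book is the relative form .//book, which searches every level below the root element. The document is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with <book> elements at three different depths.
doc = (
    "<catalog>"
    "<book>Top</book>"
    "<section>"
    "<book>Nested</book>"
    "<group><book>Deep</book></group>"
    "</section>"
    "</catalog>"
)
root = ET.fromstring(doc)

# Comparable to //book: every book element, whatever path leads to it.
all_books = [b.text for b in root.findall(".//book")]
print(all_books)
```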
Data Warehouse Architecture
Data warehouse DBMS, Operational data sources, Metadata repository, Query manager, ETL tools, Data marts, User tools
Deadlock
Two or more transactions each wait for a lock held by another, so they depend on one another in a never-ending circle and none can proceed
Isolation
Transactions are isolated from one another (consistency and non-interference in the face of concurrent execution).
Locking
is the preferred method of concurrency control in a centralized, shared database environment.
fact table
contains the names of the facts, and their measures, as well as the key to each related dimension table.
Deadlock Breaking
Abort (roll back) any transaction that has been waiting more than d time units, or choose a "victim" in the wait-for graph and abort (roll back) it
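The wait-for graph approach can be sketched as cycle detection on a dictionary that maps each transaction to the transactions it is waiting on. The graph, the transaction names, and the victim policy (abort the transaction with the largest name, standing in for "the youngest") are assumptions made up for this example.

```python
def find_cycle(wait_for):
    """Return a list of transactions that form a cycle, or None."""
    done = set()  # transactions already explored without finding a cycle

    def visit(node, path):
        if node in path:
            return path[path.index(node):]  # the cycle closes here
        if node in done:
            return None
        for waited_on in wait_for.get(node, ()):
            cycle = visit(waited_on, path + [node])
            if cycle:
                return cycle
        done.add(node)
        return None

    for node in wait_for:
        cycle = visit(node, [])
        if cycle:
            return cycle
    return None


def abort(wait_for, victim):
    """Remove the victim and every wait on it from the graph."""
    return {
        txn: [w for w in waits if w != victim]
        for txn, waits in wait_for.items()
        if txn != victim
    }


# Hypothetical graph: T1 waits for T2, T2 for T3, T3 for T1; T4 waits for T1.
wait_for = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"], "T4": ["T1"]}

cycle = find_cycle(wait_for)
victim = max(cycle)  # assumed policy: pick the largest name as the victim
after_abort = abort(wait_for, victim)
print(cycle, victim, find_cycle(after_abort))
```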
The semistructured model
Data can be represented by any tree, highly flexible, not as structured as relational
Data Marts
a subset of a data warehouse for a particular enterprise unit (like a "view" in the relational model of data)
Typical Transactions
Retrieval, insertion, deletion, update
Transaction
A logical unit of work between Client and Server application
Approaches to concurrency control
Allow only one change at a time, or interleave actions in a way that leaves the database in a consistent state
Integrated
an enterprise will likely have many databases that are designed, developed, built, and used operationally as distinct databases (billing, inventory control, sales, financial accounting, manufacturing, etc.); the data warehouse brings their data together in a consistent form.
Condition Selection
//finalnode[//Title = 'Kyle']
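A condition in square brackets can be tried with Python's xml.etree.ElementTree, which supports a small set of XPath predicates. The sketch assumes the condition is meant to test a Title child of each matched node, so the predicate is written as [Title='Kyle']. The document and the element names person and Age are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; only Title comes from the card above.
doc = (
    "<people>"
    "<person><Title>Kyle</Title><Age>30</Age></person>"
    "<person><Title>Ann</Title><Age>25</Age></person>"
    "</people>"
)
root = ET.fromstring(doc)

# Select every person, at any depth, whose Title child has the text 'Kyle'.
matches = root.findall(".//person[Title='Kyle']")
ages = [p.findtext("Age") for p in matches]
print(ages)
```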
Roll-up
aggregation on a data cube by climbing up a concept hierarchy, e.g. rolling up Home into a "US" group or by dimension reduction, e.g., eliminating an entire dimension such as Gender.
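Both kinds of roll-up can be sketched on a tiny cube held in a Python dict keyed by (home, gender). The home values, the counts, and the hierarchy that maps each home to a country are made up for the example.

```python
from collections import defaultdict

# Hypothetical facts: (home, gender) -> count.
facts = {
    ("Ohio", "F"): 4,
    ("Ohio", "M"): 6,
    ("Texas", "F"): 5,
    ("Ontario", "M"): 2,
}

# Hypothetical concept hierarchy for the home dimension.
HIERARCHY = {"Ohio": "US", "Texas": "US", "Ontario": "Canada"}


def roll_up_home(facts):
    """Climb the hierarchy: aggregate each home into its country."""
    totals = defaultdict(int)
    for (home, gender), count in facts.items():
        totals[(HIERARCHY[home], gender)] += count
    return dict(totals)


def roll_up_drop_gender(facts):
    """Dimension reduction: eliminate the gender dimension entirely."""
    totals = defaultdict(int)
    for (home, _gender), count in facts.items():
        totals[home] += count
    return dict(totals)


by_country = roll_up_home(facts)
without_gender = roll_up_drop_gender(facts)
print(by_country)
print(without_gender)
```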
