Transaction

defination

transaction is a sequence of operations and performed as a signle logic unit of work

two main purposes

  • To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure

  • To provide isolation between programs accessing a database concurrently

four properties

  • Atomic: requires that each transaction be “all or nothing”: if one part of the transaction fails, then the entire transaction fails, and the database state is left unchanged.
  • Consistent: ensures that any transaction will bring the database from one valid state to another. Integrity constraints must be maintained so that the database is consistent before and after the transaction.
  • Isolated: ensures that multiple transactions can occur concurrently without leading to inconsistency of database state
  • Durable: ensures that once the transaction has completed execution, the updates and modifications to the database are stored in and written to disk and they persist even is system failure occurs

transaction exception

  • Dirty read: occurs when a transaction is allowed to read data from a row that has been modified by another running transaction and not yet committed
  • Non-repeatable reads: when during the course of a transaction, a row is retrieved twice and the values within the row differ between reads
  • Phantom reads: occurs when, in the course of a transaction, two identical queries are executed, and the collection of rows returned by the second query is different from the first.
  • Lost update: A second transaction writes a second value of a data-item (datum) on top of a first value written by a first concurrent transaction, and the first value is lost to other transactions running concurrently which need, by their precedence, to read the first value

four isolation level

  • Serializable: serializability requires read and write locks (acquired on selected data) to be released at the end of the transaction. Also range-locks must be acquired when a SELECT query uses a ranged WHERE clause
  • Repeatable reads: keeps read and write locks (acquired on selected data) until the end of the transaction. However, range-locks are not managed, so phantom reads can occur.
  • Read committed: keeps write locks (acquired on selected data) until the end of the transaction, but read locks are released as soon as the SELECT operation is performed.
  • Read uncommitted: This is the lowest isolation level. In this level, dirty reads are allowed, so one transaction may see not-yet-committed changes made by other transactions.

locks

  • Exclusive locks: Exclusive locks are, as the name implies, exclusively held by a single entity, usually for the purpose of writing to the record. If the locking schema was represented by a list, the holder list would contain only one entry.
  • Shared locks: Shared locks differ from exclusive locks in that the holder list can contain multiple entries. Shared locks allow all holders to read the contents of the record knowing that the record cannot be changed until after the lock has been released by all holders

Index

defination

A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
CREATE [UNIQUE|FULLTEXT|SPATIAL] INDEX index_name
[index_type]
ON tbl_name (index_col_name,...)
[index_option]
[algorithm_option | lock_option] ...
index_col_name:
col_name [(length)] [ASC | DESC]
index_option:
KEY_BLOCK_SIZE [=] value
| index_type
| WITH PARSER parser_name
| COMMENT 'string'
index_type:
USING {BTREE | HASH}
algorithm_option:
ALGORITHM [=] {DEFAULT|INPLACE|COPY}
lock_option:
LOCK [=] {DEFAULT|NONE|SHARED|EXCLUSIVE}

indices

  • Ordered indices: Indices are based on a sorted ordering of the values
  • Hash indices: Indices are based on the values being distributed uniformly across a range of buckets. The buckets to which a value is assigned is determined by function called a hash function.

Questions

  • What are advantages of DBMS over traditional file based systems?
    • Data redundancy and inconsistency
    • Difficulty in accessing data
    • Data isolation – multiple files and formats
    • Integrity problems
    • Atomicity of updates
    • Concurrent access by multiple users
    • Security problems
  • What are super, primary, candidate and foreign keys?

    • a superkey is a attribute set which can identify a row
    • a candidate key is minimal superkey
    • a primary key is one of the candidate keys
    • foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table
  • What is the difference between primary key and unique constraints?

    • Primary key cannot have NULL value, the unique constraints can have NULL values. There is only one primary key in a table, but there can be multiple unique constrains
  • What is database normalization?

    • It is a process of analyzing the given relation schemas based on their functional dependencies and primary keys to achieve the following desirable properties: Minimizing Redundancy, Minimizing the Insertion, Deletion, And Update Anomalies
  • What is the difference between having and where clause?

    • HAVING is used to specify a condition for a group or an aggregate function used in select statement. The WHERE clause selects before grouping. The HAVING clause selects rows after grouping. Unlike HAVING clause, the WHERE clause cannot contain aggregate functions
    • SELECT name, section FROM tbl GROUP BY name, section HAVING COUNT(*) > 1
  • What is Join?

    • An SQL Join is used to combine data from two or more tables, based on a common field between them
    • 1
      2
      3
      4
      5
      SELECT StudentCourse.CourseID, Student.StudentName
      FROM StudentCourse
      INNER JOIN Customers
      ON StudentCourse.EnrollNo = Student.EnrollNo
      ORDER BY StudentCourse.CourseID;
    • full join
      full join

    • left join
      left join
    • right join
      right join
    • inner join
      inner join
  • What is a view in SQL? How to create one

    • A view is a virtual table based on the result-set of an SQL statement. We can create using create view syntax.
    1
    2
    3
    4
    CREATE VIEW view_name AS
    SELECT column_name(s)
    FROM table_name
    WHERE condition
  • What is a Trigger?

    • A Trigger is a code that associated with insert, update or delete operations. The code is executed automatically whenever the associated query is executed on a table. Triggers can be useful to maintain integrity in database.
  • What is a stored procedure?

    • It contains a set of operations that are commonly used in an application to do some common database tasks.
  • What is the difference between Trigger and Stored Procedure?

    • Unlike Stored Procedures, Triggers cannot be called directly. They can only be associated with queries.
  • What are clustered and non-clustered Indexes?

    • Clustered indexes is the index according to which data is physically stored on disk. Therefore, only one clustered index can be created on a given database table.
      Non-clustered indexes don’t define physical ordering of data, but logical ordering. Typically, a tree is created whose leaf point to disk records. B-Tree or B+ tree are used for this purpos
  • Query to find 2nd highest salary of an employee?

    1
    select min(price) from (SELECT Price FROM Products order by price desc limit 2);