123
 123

Tip: 看不到本站引用 Flickr 的图片? 下载 Firefox Access Flickr 插件 | AD: 订阅 DBA notes --

2010-06-17 Thu

22:20 split column value (720 Bytes) » DBA@SKY-MOBI
今天一位同事找我了解关于拆解字段内容的问题: 有个字段,内容是 1|2|4 2|3|5 拆开然后group by 比如上面两行的结果是 字段  count 1   1 2   2 3   1 4   1 5   1 使用PostgreSQL的函数进行拆解,过程如下: test=> create table tbl_Test (col1 text); CREATE TABLE test=> insert into tbl_test values(‘1,2′); INSERT 0 1 test=> insert into tbl_test values(‘1,2′); INSERT 0 1 test=> select t,count(*) from (select regexp_split_to_table(‘1,2,2′,’,') t from tbl_test)t group by t; t | count —+——- 2 |     4 1 |     2 (2 rows) test=> select t,count(*) [...]
17:18 Compojure and MongoDB: Sample App (1498 Bytes) » myNoSQL
Compojure and MongoDB: Sample App:

My fascination for Clojure and a bit of NoSQL flavor made me mention this basic sample app:

We’ll create a small Compojure[1] application that will serve as a backend for the Todos application. The application data will be stored in MongoDB.

Code also available on ☞ GitHub.


  1. ☞ Compojure: a concise web framework for Clojure  ()
12:56 High availability for MySQL on Amazon EC2 – Part 1 – Intro (4064 Bytes) » MySQL Performance Blog

Like many, I have been seduced by the power and flexibility of Amazon EC2. Being able to launch new instances at will depending on the load, is almost too good to be true. Amazon has also some drawbacks, availability is not guaranteed and discovery protocols relying on Ethernet broadcast or multicast cannot be used. That means, it is not easy to have a truly highly available solution for MySQL on EC2. If a MySQL instance fails, here are some challenges that we face:

  1. Detect the failure
  2. Kill the faulty instance
  3. Free the shared resources (ex: EBS volumes)
  4. Launch a new instances
  5. Reconfigure the shared resources
  6. Reconfigure instance monitoring
  7. Reconfigure the database clients

Facing these challenges, a customer recently asked me if I could build a viable HA solution for MySQL on EC2. The goal is to have a cheap small instance monitor the availability of a large (any size) and taking measures to keep MySQL available. A few weeks later, I ended up with a solution that work and is decently elegant using Pacemaker and Heartbeat. The setup is fairly complex, being in the Amazon EC2 virtual world is not a simplification, far from. Because of the complexity, the story will be broken into multiple posts:

  1. Part 1 – Intro (this post)
  2. Part 2 – Setting up the initial instances
  3. Part 3 – Configuring the HA resources
  4. Part 4 – The instance restart script
  5. Part 5 – The instance monitoring script
  6. Part 6 – Publishing the MySQL server location
  7. Part 7 – Pitfalls to avoid

Hopefully, I should be able to write those posts quickly but since consulting is my primary duty, I don’t have much control over my workload.

Stay tuned!


Entry posted by Yves Trudeau | One comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks

11:18 Cassandra, HBase, and PNUTS Compared (1662 Bytes) » myNoSQL

From ☞ Amandeep Khurana a nice matrix comparing characteristics of Cassandra, HBase and PNUTS:

Recently, in a course that I’m taking at UC Santa Cruz, I got a chance to present the PNUTS paper and compare the system with Bigtable and Dynamo.

The Cassandra, HBase, and PNUTS matrix includes details about:

  • data model
  • consistency model
  • ACID semantics
  • storage
  • replication
  • fault tolerance
  • scalability
  • aplicability

These systems are also part of the ☞ Yahoo! Cloud Serving Benchmark, a benchmark that has been open sourced recently and is available on ☞ GitHub

08:04 A NoRM-MongoDB Repository Base Class (1285 Bytes) » myNoSQL
A NoRM-MongoDB Repository Base Class:

Always persisting object graphs using the embedded document strategy may not be such a good idea even if it sounds like a good strategy which even works well with MongoDB advanced filtering capabilities:

NoRM makes persisting POCOs to a MongoDB collection simple. Object graphs are automatically serialized to a BSON document.

Think about your data firstly.

06:00 恩墨科技成功帮助某金融用户恢复ASM故障 (3542 Bytes) » Oracle Life

作者:eygle 发布在 eygle.com

今天是节后的第一天,下午收到某客户的服务请求,客户的数据库无法启动,磁盘组出现如下错误:
*** 2010-06-17 14:03:58.522
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kfcChkAio01], [], [], [], [], [], [], []
ORA-15196: invalid ASM block header [kfc.c:5552] [hard_kfbh] [3057] [2147483648] [4 != 13]
这个错误是由于ASM元数据损坏导致的,kfcChkAio01这个提示可以看出错误和磁盘检查有关。猜测这个错误的含义是:kernel file check Check aio error。如果不是存储问题,那么就是磁盘组出现了异常。而这个案例正是磁盘组的异常导致的。

这种情况非常偶然和少见,恢复的难易程度取决于损坏元数据的多少,需要做的是通过kfed去校验和修复损坏的元数据。

很幸运我们在1个小时之内快速帮用户改好了损坏的元数据,成功挂载磁盘,启动了数据库。恩墨科技已经具备了帮助用户恢复ASM存储故障的能力。

这个错误的产生原因值得借鉴,我们要学会怎样规避一些异常的风险,规避风险比解决问题更重要
  • 客户在大批量导入数据时数据库挂起
  • 客户shutdown abort停止了数据库实例
  • 客户shutdown abort关闭了ASM实例
在这样的步骤之后,数据库无法启动了。

这样的案例2月份我曾经遇到过一次,另外一个银行客户,同样在RAC环境中,由于多次异常关闭数据库之后,数据库因为ORA-00600 6006错误无法启动,我们同样艰难的帮助用户恢复了数据。

所以这里我想说的是,学会如何规避风险往往比解决问题更重要,因为数据库的风险太多且无处不在




相关文章|Related Articles

评论数量(4)|Add Comments

本文网址:

03:53 Berlin Buzzwords Presentations (1895 Bytes) » myNoSQL

The organizers of the Berlin Buzzwords NoSQL event have set up a ☞ wiki page with links to all presentations.

My 5 top favorites:

What are yours?

03:48 Oracle Mutex实现机制 (4727 Bytes) » Hello DBA

我们都知道Latch是Oracle用来在内存中做串行控制的机构,从10g R2开始,Oracle引入了一个新的技术-Mutex。Mutex并不是Oracle的发明,而是系统提供的一个底层调用,Oracle只是利用它实现串行控制的功能,并替换部分Latch。

Mutex中有两个变量:分别是Holider identifer和Reference count,Holider identifer记录持有mutex的SID,而Reference count是一个计数,记录了当前正在以share方式访问mutex的数量,每当session以share方式持有mutex时,计数会加1,而释放时会减1。如果Reference count大于零,则表示该内存结构正在被Oracle pin住。

我们看一段伪代码,演示mutex的申请过程:

Function Mutex_get(mutex_name)
{
  if mutex.holder:=SID
    case mode:
    'exclusive':
      if mutex.ref_count=0
        return TRUE
      else
        mutex.holder.clear;
        reture FALSE
      end if
    'share':
      mutex.ref_count++
      mutex.holder.clear
      return TRUE
    end case
  else
    reture FALSE
  end if
}

Mutex是如何实现串行控制的,实际上它是利用了操作系统的一个原子操作CAS(compare-and-swap)实现的。我们看到函数的开始处:mutex.holder:=SID,将SID赋值给mutex的Holider Identifer,这里就是一个原子的CAS操作,首先比较mutex.holder是否为空,如果不为空则赋值session的SID。CAS操作由OS来保证其原子性,在同一时刻这个操所是串行的。如果这个赋值操作失败,整个申请过程失败。赋值成功后,如果是share方式,则mutex.ref_count加1,并清空mutex.holder,如果是exclusive方式,需要判断mutex.ref_count是否为零(是否被pin住),如果大于0,则失败,并清空mutex.holder,如果等于0,则成功,这时不清空mutex.holder,保持当前session对mutex的exclusive占用,直到释放为止。

Mutex相比latch带来了以下的好处:

1.更少的资源消耗,mutex与latch不同,它不是独立存在的,而是在每个内存结构中,并随着内存结构创建和释放,mutex同时也被创建和释放。mutex暂用的空间比latch小很多,创建和释放消耗更少的资源。

2.有效降低竞争,因为mutex是每个内存结构中的一部分,这样意味着mutex的数量可以有很多,而不同于latch,一个latch需要管理很多个内存结构,当你访问同一latch管理的不同内存结构时,也会发生竞争,而mutex则不会。另外,因为latch的数量有限,很多时候latch本身的竞争会很厉害,之前,我们只能增加latch数量或者减少latch持有的时间,而现在,mutex是一个更好的选择。

3.更快的pin,当block被访问时,它必须被pin在buffer cache中,当一个cursor执行时,它也必须被pin在library cache中,如果大量并发频繁执行同一个cursor,library cache pin会耗费大量的CPU资源。而mutex使用reference count来解决并发访问的问题,只要它大于零,就表示它已经被pin在了内存中,不能被交换出去。而且mutex.ref_count++这个操所是非常快的,只占用非常少的资源。

Mutex申请的过程和latch类似,同样需要spin和sleep,不同的是Oracle硬编码了mutex spin的次数为255次(Latch spin的次数默认为2000,由隐含参数_spin_count控制)。latch sleep会随着等待次数的逐步增加,每次sleep的时间也会逐步增加。而mutex sleep则比较特别,它有三个选项,分别是yield CPU,sleep或者block other process,允许开发人员来决定采用哪种选项。

由于在某些RISC的操作系统中(HP-UNIX),由于系统不支持CAS操作,Oracle通过创建一个latch pool来模拟了CAS操作,被称为KGX latch,如果你发现系统中存在这种latch竞争,说明操作系统不支持CAS操作,可以通过_kks_use_mutex_pin关闭mutex。

mutex主要使用在library cache中,用来取代原来的library cache pin和library cache lock,关于library cache中锁的实现机制,我会在另外一篇文章中说明。

–EOF–

注:有关CAS,可以参考wiki上的说明。

03:06 Video: 2 Hours Riak Tutorial (1344 Bytes) » myNoSQL
Video: 2 Hours Riak Tutorial:

A must see tutorial on Riak by Sean Cribbs.

Compare that with Riak in 10 minutes:

01:53 Release: AllegroGraph 4.0, 100% ACID (3318 Bytes) » myNoSQL

AllegroGraph RDFStore[1], a solution at the crossing of RDF stores[2] and graph databases, has recently release a new major update featuring:

  • AllegroGraph is 100 percent ACID, supporting Transactions: Commit, Rollback, and Checkpointing. See the new tutorials for the Java and Python clients
  • Full and Fast Recoverability
  • 100% Read Concurrency, Near Full Write Concurrency
  • Online Backups
  • Dynamic and Automatic Indexing – All committed triples are always indexed (7 indices)
  • Advanced Text Indexing – Lucene style but faster, text indexing per predicate. See the new tutorials for the Java and Python clients
  • Duplicate Triple deletion while indexing
  • All Clients based on http REST Protocol – Java, Sesame, Jena, and Python
  • Completely multi-processing based (SMP) – Automatic Resource Management for all processors and disks, and optimized memory use. See the new performance tuning guide here, and new server configuration guide here
  • Column-based compression of indices similar to column-based RDBMS – reduced paging, better performance
  • Dedicated and Public Sessions – In dedicated sessions users can work with their own rule sets against the same database
  • Python Client Improvements – We now provide a full Python interface. The API is based on the Java Sesame interface and includes Spatial-Temporal and Social Network support
  • LUBM Benchmarks – Updated for this release

As a side note, in the graph databases space, AllegroGraph is not the first one supporting ACID, Neo4j being fully transactional for quite a while[3].

2010-06-16 Wed