Tip: 看不到本站引用 Flickr 的图片? 下载 Firefox Access Flickr 插件 | AD: 订阅 DBA notes -- ![]()
2010-03-05 Fri
看到Kamus对SQLULDR2的留言后, 破有感触. 人们应当比较关注, 他们想要的功能用起来方便是否, 关键并不在于功能的多少. 而SQLULDR2的众多的命令行选项, 也确实有些让人发晕, 包括我自已.
为了方便大多数人使用, 简化了SQLULDR2的命令行帮助, 简化到如下所示.
SQL*UnLoader: Fast Oracle Text Unloader (GZIP), Release 3.0.1
(@) Copyright Lou Fangxin (AnySQL.net) 2004 - 2010, all rights reserved.
Usage: SQLULDR2 keyword=value [,keyword=value,...]
Valid Keywords:
user = username/password@tnsname
sql = SQL file name
query = select statement
field = separator string between fields
record = separator string between records
read = set DB_FILE_MULTIBLOCK_READ_COUNT at session level
sort = set SORT_AREA_SIZE at session level (UNIT:MB)
hash = set HASH_AREA_SIZE at session level (UNIT:MB)
array = array fetch size
rows = print progress for every given rows (default, 1000000)
file = output file name(default: uldrdata.txt)
log = log file name, prefix with + to append mode
text = output type (MYSQL, CSV, MYSQLINS, ORACLEINS, FORM, SEARCH).
parfile = read command option from parameter file
for field and record, you can use '0x' to specify hex character code,
\r=0x0d \n=0x0a |=0x7c ,=0x2c, \t=0x09, :=0x3a, #=0x23, "=0x22 '=0x27
对于专家而言, 可以用如下方式得到以前全部的命令行选项.
sqluldr2 help=yes
通过引入一个TEXT选项, 来针对不同格式的导出进行相关选项的设置, 不仅方便了大家使用, 也可以对SQLULDR2的功能有一个很直接的了解, 例如SQLULDR2可以导出数据给MySQL用, 或导出成Excel可以打开的标准CSV文件, 或是生成MySQL和ORACLE上的INSERT语句, 也可以按列显示记录, 或为一些特殊的搜索程序生成数据源.
再次感谢Kamus的好建议, 今年是支付宝的用户体验年, 应当从用户角度进行反思.
Relative Posts:
Is it Friday already? Where does the time go? Lots of stuff going on this week–here’s a few of the things that I found interesting.
Operating Systems
Russia Today-TV announced the existence of “Red Star”, the new OS developed in North Korea and based on Linux. I found this by way of Slashdot of course, citing the source as The Korea herald. According to the article it looks very much like the Windows UI, and features a “My Country” icon that allows connection to Korea’s closed internet-like network and the Woori office application. Slightly more information can be found there in the article N. Korea develops own OS.
The upcoming Ubuntu 10.04 (Lucid Lynx) has had the third alpha version released. The alpha versions are not for the faint of heart, and I have had to put in several bug reports myself already. It’s just part of the process though and will hopefully result in a rock solid release next month. The full release announcement can be found on the mailing list: see Lucid Alpha 3 released. A rundown of some of the things you can expect is on The H Online in Canonical releases Ubuntu 10.04 LTS Alpha 3.
Ars Technica has more on an item mentioned last week about the future of Open Solaris. OpenSolaris not dead; might not get all new Solaris features The exact details of which features may be excluded are not clear from the article but I did not get a chance to dig into some of the source material referenced so you may find more by clicking through them.
Internet
Opera 10.5 is out and Ars Technica has a detailed review of the new features, performance improvements and UI changes. There’s a lot of detail in the article, I suggest you read it for yourself. Hands on: Opera 10.50 makes impressive performance gains is the place to go. I think I’ll give the new Opera a try myself.
Local company makes good! I mentioned last week the takedown of the Waledac botnet, this week the Mariposa botnet suffered a similar fate. Ottawa’s own Defence Intelligence detected this botnet in 2009 and started an investigation that ended with the dismantling of the botnet by Spanish authorities. This is a much more important takedown than Waledac because Mariposa actually put sensitive and financial information at risk. You can read more of this story in Spanish arrests mark the end of dangerous botnet. Ottawa, represent!
Virtualization
Microsoft has released new drivers to improve the performance of Red Hat Enterprise Linux systems running under Hyper-V. The new drivers include new network and storage drivers, but still only support a single virtual CPU. These drivers are not officially included in, or supported by Red Hat, but they are fully support by Microsoft. Read Microsoft releases Hyper-V drivers for RHEL for more, and you can check out Microsoft’s side of the story on the technet blog in Red Hat Enterprise Linux and Hyper-V.
How Server Virtualization Impacts Storage. The title says it all. Penned by Heidi Biggar from Hitachi it has a good perspective on current industry trends and growth. We all know that consolidation methods, like virtualization, add bottlenecks on resources like CPU, memory and storage because they are all shared by many instances. Find out what Heidi has to say on the subject.
Data Center
Data Center Knowledge is reporting that the demand for data centers has been growing continually and no change is in sight for 2010. We’re certainly seeing the trend here with many companies making the push for full, real disaster recovery sites. There seems to be a real shortage of data center space and new ones are popping up slowly, so if you have lots of cash this may be a good business to get into. But I digress. The full story is in Strong Data Center Demand Seen for 2010.
Security
The H Online has a good article about the US government’s new cyber security policy. I haven’t had a chance to review the whole document as yet, but I have it bookmarked for later reading. US government publishes parts of its cyber security directive is where to go, which also links the source document.
Hardware
Tom’s Hardware has an interesting Op-Ed piece on the past, present and future of the 3D graphics industry. See Opinion: AMD, Intel, And Nvidia In The Next Ten Years for 10 pages of goodness.
Allyn Malventano has a very detailed review of Western Digital’s first foray into the SSD market. The new 256GB offering performed well through the tests but is still priced a bit too high for the market. The full review is at PC Perspective, see Western Digital SiliconEdge Blue 256GB SSD Review – WD enters the SSD market. You can find the Western Digital product information page on their site.
Well that about wraps it up for this edition. Comments are always welcome. Until next time, may your days be productive and the levels of administratium be non-toxic.
The 181st edition of Log Buffer has been published by Gary Myers on his Sydney Oracle Lab.
Having recently moved his blog, Gary approached the Log Buffer coordinator to volunteer for an edition because he knows that, with LB being a popular and established destination in the database blogoshphere, it would help him broadcast his new blog and welcome readers to it. You can do it too–simply send an email to the Log Buffer coordinator.
Here’s Gary’s Log Buffer #181.
So mothers keep your hackers at home
Don’t let them journey all alone
Tell them this world is full of danger
And to shun the repositories of strangers
– The Tag Set of Strangers,
(with apologies to) Nick Cave and the Bad Seeds
One of the things I love about Git is how I can add branches from remote repositories in mine at will without fearing messing up anything. The remote branches will not clash with mine, even if they share the same names, because they are referenced as repository/branch. However, as for anything else, you can still poke yourself in the eye if you try hard enough:
$ git remote bob $ git checkout -b bob/baroque Switched to a new branch 'bob/baroque' $ git fetch bob From ../bob * [new branch] baroque -> bob/baroque $ git checkout bob/baroque warning: refname 'bob/baroque' is ambiguous. Already on 'bob/baroque'
Here, I created a local branch called bob/baroque, which will end up having the same name as the branch baroque imported from Bob’s repository. Confusing, but not the end of the world. I can still see the different branches with gitk and access the local and remote branches via git checkout remotes/bob/baroque and git checkout remotes/bob/baroque.[1] The lesson to take from this, of course, is simply not to use slashes in branch names and sidestep the whole issue.
[1] The fiendish-minded reader probably wonders at this point what would happen if I were to create another local branch called remotes/bob/baroque. I would deserve to be shot, that’s what would happen.
With tags, surprisingly, the matter is much more prickly. Not only are tags not kept in the ”namespace” of their repository of origin, but git fetch has a very dangerous default behavior:
$ git log -n 1 somework
commit 483d008c6207554236232fef4e8cd22cfb4b9bb8
Author: Yanick Champoux <yanick@babyl.dyndns.org>
Date: Wed Mar 3 21:14:43 2010 -0500
some work on my repo
$ git fetch --tags bob
From ../b
- [tag update] somework -> somework
$ git log -n 1 somework
commit 5f7f8eddd2d44e359fe8bc0d1a2f1642d073cad9
Author: Yanick Champoux <yanick@babyl.dyndns.org>
Date: Wed Mar 3 21:15:25 2010 -0500
some work from Bob
Yup, that’s right, if there is a conflict, fetch --tags will silently clobber the local tags with their remote counterparts. Hope you remember all the commits you painstakingly tagged in the last six months… This behavior is so mind-bogglingly dangerous that, to this day, I wonder if I’m not missing something obvious.
Now, granted, it’s fairly rare to import tags from remote repositories. But there are instances, like when adding the gitpan history of your module to your repository, where it’s relevant. In those cases, for your own sake, make sure that the remote tags won’t clash with yours. Even better, don’t use --tags at all. Instead, do something akin to:
$ git ls-remote --tags bob | \ perl -nae'$F[1]=~s#refs/tags#bob#; `git tag $F[1] $F[0]`'
This will name remote tags using the repository/tag convention. Also, if the tag already exist, Git will complain and it won’t be clobbered by the new version.
Last week has featured two of the most interesting posts about MongoDB: first coming from Mathias Meyer (@roidrage) ☞ offline investigation of MongoDB and the second, a set of notes from running MongoDB in production published on ☞ Boxed Iced blog.
If you are interested in getting started with MongoDB, I’d encourage you take the time to go through Mathias’ post which covers the following aspects (I’ve also included a couple of comments)
-
collections and capped collections
Note: I couldn’t really understand the usage of namespaces and the implication on indexes
- data format
-
references
Note: I’d also strongly suggest taking a look at MongoDB documentation on ☞ schema design for more details
- indexes
- updates
- querying
-
durability
Note: we have covered before the MongoDB durability tradeoff in much detail
- replication
- caching
- backup
- storage
-
concurrency
Note: I’d really appreciate more details on this topic as it is not completely clear if all access (both read and writes) is serialized or just writes are serialized (or not?); also the impact on indexes is not clear either.
- memory
- GridFS
-
protocol access
Note: We argued before that access protocols are extremely important. MongoDB is one of the NoSQL solutions using a proprietary protocols and tries to “compensate” for that with tons of libraries
-
sharding
Note: probably biased, but I still wait for the moment MongoDB sharding would become at least beta.
MapReduce support seems to be missing from Mathias notes, but luckily we have that covered for you: MongoDB MapReduce tutorial.
While keeping in mind that some of these features are not unique to MongoDB and can be found in other systems, you should be ready to cross check your app requirements with the lessons learned by the guys at Boxed Ice:
-
namespace limits
We split our customers across (currently) 3 MongoDB databases because there is a namespace limit of 24,000 per database. This is essentially the number of collections + number of indexes.
-
initial sync/replication of large databases
Our databases are very large and it takes about 48-72 hours to fully sync all our current data onto a new slave in a different DC (via a site-to-site VPN for security). During this time you’re at risk because the slave is not up to date.
-
initial sync “slows” things
When doing a fresh sync from a master to a slave, we have observed a “slowdown” in our application response times.
-
index creation blocks
However, if you have an existing collection and create a new index on it then that process will block the database until the index is created.
-
efficiency of reclaiming diskspace
We have found that there is a massive discrepancy between a master and a freshly copied slave.
Even if not every application will have to deal with the size Boxed Ice is dealing, I couldn’t stop noticing that parts of the process of scaling MongoDB were really painful. Or as Sergio Bossa put it in ☞ one of the comments:
Anyways, it seems indeed you had almost the same problems you would had with a MySQL solution:
- Huge data to deal with.
- Manual sharding.
- Sync/replication delays.
So why didn’t you evaluate to switch to a more “large-scale” nosql solution like Cassandra or Riak?
Last but not least, drop me a note if you are planning to use or already using MongoDB in production and you’d like to share your experience with the NoSQL community
本文主要翻译自Jonathan Lewis的Blog Entry Log File Switch, 最后关于checkpoint部分,我添加了部分我自己的注释(以斜体字显示).
日志文件切换
几天前,我发布一个简单的查询语句,它通过查询v$log_history视图展示日志文件切换之间的时间间隔.如果你运行这个脚本,并且认为你的系统的日志文件切换频率不合适,那么你该如何处理呢?
如果时间间隔太短(由于日志文件频繁切换引发的checkpoint动作可能会导致DBWR进程过于活跃),可以通过新增新的更大的日志文件并删除旧的日志文件来解决.
如果切换的时间间隔太久,或者是时间间隔波动太大,而你希望使日志切换更加有规律,那么可以使用参数archive_lag_target(在Oracle 9i中引入)来为日志文件切换设置一个超时值(单位为秒).如果日志文件没有在指定的时间间隔内发生切换,那么系统会强制其切换日志,并且归档对应的已经使用的在线重做日志.
在此之前,人们一般使用类似于cron或者dbms_job一类的工具来实现(通过执行一条”alter system switch logfile“命令).从Oracle 9i开始,利用这个参数来实现就简单多了.
附录
- Instance Recovery Checkpoint
- Media Recovery Checkpoint
- Thread Checkpoint
- Interval Checkpoint
- Tablespace Checkpoint
- PQ Tablespace Checkpoint
- Close Database Checkpoint
- Incremental Checkpoint
- Local Database Checkpoint
- Global Database Checkpoint
- Object Reuse Checkpoint
- Object Checkpoint
实例恢复触发的检查点
介质恢复触发的检查点
单个instance关闭以及log file switch触发的检查点
log_checkpoint_timeout参数超时触发的检查点
表空间online/offline/read only/read write操作对应的检查点
不清楚具体原因
常规的检查点动作
对应于alter system checkpoint
对应于alter system checkpoint all命令的检查点.
对应于truncate table操作的检查点
对应于drop table的检查点操作
Oracle支持多种不同类型的checkpoint,这个列表会随着Oracle版本的不同而有所不同.Oracle 10.2.0.3的对应的checkpoint看似有如下这些:
在这些中间,我认为日志文件切换时引发的checkpoint可能是ThreadCheckpoint.
No related posts.
If we never have enough intro presentations to MongoDB, why would we have enough CouchDB videos?
Embedded below is a video of Will Leinweber presenting Relaxing with CouchDB (38 minutes)
We’ve never got enough introductions to NoSQL systems. Embedded below are the slides from Alex Sharp’s (@ajsharp): Intro to MongoDB presentation. Just to allow you quick overview, you can find below also the text only version.
Text-only version of Intro to MongoDB
-
Slide: 1
Intro to MongoDB
Alex Sharp
twitter: @ajsharp
-
Slide: 2
So what is MongoDB?
-
Slide: 3
First and foremost…
-
Slide: 4
IT’S THE NEW HOTNESS!!!
-
Slide: 5
omgomgomg
SHINY OBJECTS
omgomgomg
-
Slide: 6
MongoDB (from “humongous”) is a scalable, high-performance, open source, schema-free, document-oriented database.
- mongodb.org
-
Slide: 7
Philosophy
-
Slide: 8
Philosophy
“One size fits all” approach no longer applies
-
Slide: 9
Philosophy
Non-relational DBs scale more easily, especially horizontally
-
Slide: 10
Philosophy
Focus on speed, performance, flexibility and scalability
-
Slide: 11
Philosophy
Not concerned with transactional stuff and relational semantics
-
Slide: 12
Philosophy
DBs should be an on-demand commodity, in a cloud-like fashion
-
Slide: 13
Philosophy
Mongo tries to achieve the performance of traditional key-value stores while maintaining functionality of traditional RDBMS
-
Slide: 14
Features
-
Slide: 15
Features
Standard database stuff
-
Slide: 16
Features
Standard database stuff
Indexing
-
Slide: 17
Features
Standard database stuff
Indexing
replication/failover support
-
Slide: 18
Features: Document Storage
Documents are stored in BSON (binary JSON)
-
Slide: 19
BSON is a binary serialization of JSON-like objects
Features: Document Storage
-
Slide: 20
Features: Document Storage
This is extremely powerful, b/c it means mongo understands JSON natively
-
Slide: 21
Features: Document Storage
Any valid JSON can be easily imported and queried
-
Slide: 22
Features
Schema-less; very flexible
-
Slide: 23
Features
Schema-less; very flexible
no more blocking ALTER TABLE
-
Slide: 24
Features
Auto-sharding (alpha)
-
Slide: 25
Features
Makes for easy horizontal scaling
-
Slide: 26
Features
Map/Reduce
-
Slide: 27
Features
Very, very fast
-
Slide: 28
Features
Super easy to install
-
Slide: 29
Features
Strong with major languages
-
Slide: 30
Features
Document-oriented = flexible
-
Slide: 31
Features: Querying
Rich, javascript-based query syntax
-
Slide: 32
Features: Querying
Rich, javascript-based query syntax
Allows us to deep, nested queries
-
Slide: 33
Features: Querying
Rich, javascript-based query syntax
Allows us to do deep, nested queries
db.order.find( { shipping: { carrier: "usps" } } ); -
Slide: 34
Features: Querying
Rich, javascript-based query syntax
Allows us to deep, nested queries
db.order.find( { shipping: { carrier: "usps" } } );shipping is an embedded document (object)
-
Slide: 35
Features: Binary Object Store
Efficient binary large object store via GridFS
-
Slide: 36
Features: Binary Object Store
Efficient binary large object store via GridFS
i.e. store images, videos, anything
-
Slide: 37
Concepts
-
Slide: 38
Concepts: Document-oriented
Think of “documents” as database records
-
Slide: 39
Concepts: Document-oriented
Think of “documents” as database records
Documents are basically just JSON objects that Mongo stores in binary
-
Slide: 40
Concepts: Document-oriented
Think of “collections” as database tables
-
Slide: 44
Concept Mapping
RDBMS (mysql, postgres)
Tables
Records/rows
Queries return record(s)
MongoDB
Collections
Documents/objects
Queries return a cursor
???
-
Slide: 45
Concepts: Cursors
Queries return “cursors” instead of collections
-
Slide: 46
Concepts: Cursors
Queries return “cursors” instead of collections
A cursor allows you to iterate through the result set
-
Slide: 47
Concepts: Cursors
Queries return “cursors” instead of collections
A cursor allows you to iterate through the result set
A big reason for this is performance
-
Slide: 48
Concepts: Cursors
Queries return “cursors” instead of collections
A cursor allows you to iterate through the result set
A big reason for this is performance
Much more efficient than loading all objects into memory
-
Slide: 49
Concepts: Cursors
The
find()function returns a cursor object -
Slide: 50
Concepts: Cursors
The
find()function returns a cursor objectvar cursor = db.logged_requests.find({ 'status_code' : 200 })cursor.hasNext() // "true"cursor.forEach( function (item) {print(tojson(item))});cursor.hasNext() // "false" -
Slide: 51
Cool Features
-
Slide: 52
Cool Features
Capped collections
-
Slide: 53
Cool Features
Capped collections
Fixed-sized, limited operation, auto-LRU age-out collections
-
Slide: 54
Cool Features
Capped collections
Fixed-sized, limited operation, auto-LRU age-out collections
Fixed insertion order
-
Slide: 55
Cool Features
Capped collections
Fixed-sized, limited operation, auto-LRU age-out collections
Fixed insertion order
Super fast
-
Slide: 56
Cool Features
Capped collections
Fixed-sized, limited operation, auto-LRU age-out collections
Fixed insertion order
Super fast
Ideal for logging and caching
-
Slide: 57
Cool Uses
Data Warehouse
Mongo understands JSON natively
-
Slide: 58
Cool Uses
Data Warehouse
Mongo understands JSON natively
Very powerful for analysis
-
Slide: 59
Cool Uses
Data Warehouse
Mongo understands JSON natively
Very powerful for analysis
Query a bunch of data from some web service
-
Slide: 60
Cool Uses
Data Warehouse
Mongo understands JSON natively
Very powerful for analysis
Query a bunch of data from some web service
Import into mongo (mongoimport -f filename.json)
-
Slide: 61
Cool Uses
Data Warehouse
Mongo understands JSON natively
Very powerful for analysis
Query a bunch of data from some web service
Import into mongo (mongoimport -f filename.json)
Analyze to your heart’s content
-
Slide: 62
Cool Uses
Harmonyapp.com
Large rails app for building websites (kind of a CMS)
-
Slide: 63
Cool Uses
Hardcore debugging
Spit out large amounts of data
-
Slide: 64
Limitations
Transaction support
-
Slide: 65
Limitations
Transaction support
Relational integrity
-
Slide: 66
Resources
With the help of the community I’ve built an embedded cassandra service ideal for unit testing and perhaps other uses. I’ve also built a cleanup utility that helps wipe out all data before the service starts running so the combination of both provides isolation etc. Now each test process runs an in-process, embedded instance of cassandra.
While I do agree with the main goal behind this initiative (i.e. testability), I do think that a stub solution (take for example RockingChair a Ruby stub library for CouchDB) would be a much better fit and would perform a lot better. As the original post mentions, the embedded Cassandra service requires each test to be run in a separate JVM and as you can imagine that will be far from being quick.
2010-03-04 Thu
AnySQL.net
Oracle & Starcraft
Give you some color to see see!
Oracle Scratchpad
Oracle Life
Chanel [K]
Oracle Security Blog
MySQL Performance Blog
The Tom Kyte Blog
Delicious/Fenng/oracle
O'Reilly Databases
Red Hat Magazine
车东[Blog^2]
blue_prince
玉面飞龙的BLOG
木匠 Creative and Flexible
生活帮-LifeBang
Hey!! Sky!
dba on unix
Brotherxiao's Home
jametong's shared items in Google Reader
DBA Tools
Inside the Oracle Optimizer - Removing the black magic
DBA@Taobao
存储部落
OracleBlog.cn
知道分子
支付宝官方 Blog - 支付志
木匠的天空 Oracle Architect and Developer
Hello DBA
OS与Oracle
Cary Millsap
Guy Harrison's main page
eagle's home
dbthink
DBA Notes
OracleDBA Blog---三少个人涂鸦地!
The Pythian Blog
MyNoSQL
