Here are posterous posts filed under simpledb...

hdknr says...

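Today I decided to try SimpleDB. In short, SimpleDB is like keeping a spreadsheet in the cloud that you can read and update. The figure on this page of the official documentation captures its characteristics well.

  • The spreadsheet name is the domain
  • A row corresponds to an item, and a column to an attribute
  • It is non-normalized: a single attribute can hold multiple values
  • As with SQL, there is a query API that supports filtering by condition and sorting
  • It is schemaless and maintenance-free, with no index design or sizing to do, so it is easy
  • Pay as you go; the first 1 GB is free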
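
The points above can be illustrated with a small in-memory model. This is only a sketch: the `Domain` class and its `put` and `select` methods are hypothetical names made up for illustration, not the SimpleDB API. It shows a domain holding items, each item holding attributes, and one attribute holding several string values.

```python
from collections import defaultdict


class Domain:
    """Hypothetical in-memory model of a SimpleDB-style domain."""

    def __init__(self, name):
        self.name = name
        # item name -> attribute name -> set of string values
        self.items = defaultdict(lambda: defaultdict(set))

    def put(self, item_name, attribute, value):
        # Adding a value does not overwrite: one attribute can hold many values.
        self.items[item_name][attribute].add(str(value))

    def select(self, attribute, value):
        # Return the names of items whose attribute contains the given value.
        return sorted(
            name
            for name, attrs in self.items.items()
            if str(value) in attrs.get(attribute, ())
        )


books = Domain("books")
books.put("0385333498", "Title", "The Sirens of Titan")
books.put("0385333498", "Keyword", "Book")
books.put("0385333498", "Keyword", "Paperback")
books.put("1579124585", "Keyword", "Book")

print(books.select("Keyword", "Book"))
print(sorted(books.items["0385333498"]["Keyword"]))
```

No schema is declared anywhere: an attribute exists as soon as a value is stored under it.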

Filed under: SimpleDB

hdknr says...

(dev)hdknr@domU-12-31-39-00-D9-A1:~/.ve/dev/src$ hg clone https://hdknr@bitbucket.org/david/django-storages/
destination directory: django-storages
requesting all changes
adding changesets
adding manifests
adding file changes
added 43 changesets with 105 changes to 51 files
updating working directory
32 files updated, 0 files merged, 0 files removed, 0 files unresolved
(dev)hdknr@domU-12-31-39-00-D9-A1:~/.ve/dev/src$ cd django-storages/
(dev)hdknr@domU-12-31-39-00-D9-A1:~/.ve/dev/src/django-storages$ ls -al
total 68
drwxr-xr-x 6 hdknr users 4096 2009-10-11 05:20 .
drwxr-xr-x 3 hdknr users 4096 2009-10-11 05:20 ..
-rw-r--r-- 1 hdknr users 562 2009-10-11 05:20 AUTHORS
drwxr-xr-x 2 hdknr users 4096 2009-10-11 05:20 backends
drwxr-xr-x 2 hdknr users 4096 2009-10-11 05:20 docs
drwxr-xr-x 4 hdknr users 4096 2009-10-11 05:20 examples
drwxr-xr-x 3 hdknr users 4096 2009-10-11 05:20 .hg
-rw-r--r-- 1 hdknr users 103 2009-10-11 05:20 .hgignore
-rw-r--r-- 1 hdknr users 1539 2009-10-11 05:20 LICENSE
-rw-r--r-- 1 hdknr users 315 2009-10-11 05:20 README
-rw-r--r-- 1 hdknr users 21229 2009-10-11 05:20 S3.py
-rw-r--r-- 1 hdknr users 988 2009-10-11 05:20 setup.py


(dev)hdknr@domU-12-31-39-00-D9-A1:~/.ve/dev/src/django-storages$ python setup.py install
zip_safe flag not set; analyzing archive contents...

Installed /home/hdknr/.ve/dev/src/django-storages/setuptools_hg-0.2-py2.5.egg
running install
running bdist_egg
running egg_info
creating django_storages.egg-info
writing django_storages.egg-info/PKG-INFO
writing top-level names to django_storages.egg-info/top_level.txt
writing dependency_links to django_storages.egg-info/dependency_links.txt
writing manifest file 'django_storages.egg-info/SOURCES.txt'
writing manifest file 'django_storages.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-i686/egg
running install_lib
running build_py
creating build
creating build/lib
copying S3.py -> build/lib
creating build/lib/backends
copying backends/__init__.py -> build/lib/backends
copying backends/s3boto.py -> build/lib/backends
copying backends/couchdb.py -> build/lib/backends
copying backends/symlinkorcopy.py -> build/lib/backends
copying backends/s3.py -> build/lib/backends
copying backends/database.py -> build/lib/backends
copying backends/mosso.py -> build/lib/backends
copying backends/overwrite.py -> build/lib/backends
copying backends/ftp.py -> build/lib/backends
copying backends/mogile.py -> build/lib/backends
copying backends/image.py -> build/lib/backends
creating build/bdist.linux-i686
creating build/bdist.linux-i686/egg
creating build/bdist.linux-i686/egg/backends
copying build/lib/backends/__init__.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/s3boto.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/couchdb.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/symlinkorcopy.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/s3.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/database.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/mosso.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/overwrite.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/ftp.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/mogile.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/image.py -> build/bdist.linux-i686/egg/backends
copying build/lib/S3.py -> build/bdist.linux-i686/egg
byte-compiling build/bdist.linux-i686/egg/backends/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-i686/egg/backends/s3boto.py to s3boto.pyc
byte-compiling build/bdist.linux-i686/egg/backends/couchdb.py to couchdb.pyc
byte-compiling build/bdist.linux-i686/egg/backends/symlinkorcopy.py to symlinkorcopy.pyc
byte-compiling build/bdist.linux-i686/egg/backends/s3.py to s3.pyc
byte-compiling build/bdist.linux-i686/egg/backends/database.py to database.pyc
byte-compiling build/bdist.linux-i686/egg/backends/mosso.py to mosso.pyc
byte-compiling build/bdist.linux-i686/egg/backends/overwrite.py to overwrite.pyc
byte-compiling build/bdist.linux-i686/egg/backends/ftp.py to ftp.pyc
byte-compiling build/bdist.linux-i686/egg/backends/mogile.py to mogile.pyc
byte-compiling build/bdist.linux-i686/egg/backends/image.py to image.pyc
byte-compiling build/bdist.linux-i686/egg/S3.py to S3.pyc
creating build/bdist.linux-i686/egg/EGG-INFO
copying django_storages.egg-info/PKG-INFO -> build/bdist.linux-i686/egg/EGG-INFO
copying django_storages.egg-info/SOURCES.txt -> build/bdist.linux-i686/egg/EGG-INFO
copying django_storages.egg-info/dependency_links.txt -> build/bdist.linux-i686/egg/EGG-INFO
copying django_storages.egg-info/not-zip-safe -> build/bdist.linux-i686/egg/EGG-INFO
copying django_storages.egg-info/top_level.txt -> build/bdist.linux-i686/egg/EGG-INFO
creating dist
creating 'dist/django_storages-1.0-py2.5.egg' and adding 'build/bdist.linux-i686/egg' to it
removing 'build/bdist.linux-i686/egg' (and everything under it)
Processing django_storages-1.0-py2.5.egg
creating /home/hdknr/.ve/dev/lib/python2.5/site-packages/django_storages-1.0-py2.5.egg
Extracting django_storages-1.0-py2.5.egg to /home/hdknr/.ve/dev/lib/python2.5/site-packages
Adding django-storages 1.0 to easy-install.pth file

Installed /home/hdknr/.ve/dev/lib/python2.5/site-packages/django_storages-1.0-py2.5.egg
Processing dependencies for django-storages==1.0
Finished processing dependencies for django-storages==1.0

Filed under: SimpleDB

hdknr says...

Data Storage in Amazon SimpleDB vs. Data Storage in Amazon S3

Unlike Amazon S3, Amazon SimpleDB is not storing raw data. Rather, it takes your data as input and expands it to create indices across multiple dimensions, which enables you to quickly query that data. Additionally, Amazon S3 and Amazon SimpleDB use different types of physical storage. Amazon S3 uses dense storage drives that are optimized for storing larger objects inexpensively. Amazon SimpleDB stores smaller bits of data and uses less dense drives that are optimized for data access speed.

In order to optimize your costs across AWS services, large objects or files should be stored in Amazon S3, while smaller data elements or file pointers (possibly to Amazon S3 objects) are best saved in Amazon SimpleDB. Because of the close integration between services and the free data transfer within the AWS environment, developers can easily take advantage of both the speed and querying capabilities of Amazon SimpleDB as well as the low cost of storing data in Amazon S3, by integrating both services into their applications.

For the Beta release, a single Amazon SimpleDB domain may grow to 10 GB and you are initially allocated a maximum of 100 domains; however, over time these allocations may be raised. Please complete this form if you require additional domains.

Large data goes in S3; small data such as pointer information goes in SimpleDB.
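
The split described above can be sketched as follows. Everything here is an assumption for illustration: `blob_store` and `index` are plain dictionaries standing in for S3 and a SimpleDB domain, and `save_document`, `load_document` and the `S3Key` attribute name are hypothetical. The point is only the shape of the pattern: the large payload goes to the object store, and the queryable record keeps a pointer to it.

```python
import hashlib

# In-memory stand-ins (hypothetical): blob_store plays the role of S3,
# index plays the role of a SimpleDB domain.
blob_store = {}
index = {}


def save_document(item_name, payload, title):
    # Store the large payload under a content-derived key.
    key = "docs/" + hashlib.sha1(payload).hexdigest()
    blob_store[key] = payload
    # Keep only small, queryable attributes plus the pointer to the payload.
    index[item_name] = {"Title": title, "Size": str(len(payload)), "S3Key": key}
    return key


def load_document(item_name):
    # Look up the pointer first, then fetch the large object it refers to.
    return blob_store[index[item_name]["S3Key"]]


key = save_document("report-1", b"x" * 100000, "Quarterly report")
print(index["report-1"])
```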

Filed under: SimpleDB

hdknr says...

(jail)hdknr@mailjail:~/.ve/jail/src$ svn checkout http://simpledb-dev.googlecode.com/svn/trunk/ simpledb-dev
A    simpledb-dev/simpledb-dev
A    simpledb-dev/simpledb-dev/src
A    simpledb-dev/simpledb-dev/src/simpledb_dev.py
A    simpledb-dev/simpledb-dev/src/portalocker.py
A    simpledb-dev/simpledb-dev/src/templates
A    simpledb-dev/simpledb-dev/src/templates/Query.xml
A    simpledb-dev/simpledb-dev/src/templates/GetAttributes.xml
A    simpledb-dev/simpledb-dev/src/templates/ListDomains.xml
A    simpledb-dev/simpledb-dev/src/templates/QueryWithAttributes.xml
A    simpledb-dev/simpledb-dev/src/templates/DeleteAttributes.xml
A    simpledb-dev/simpledb-dev/src/templates/error.xml
A    simpledb-dev/simpledb-dev/src/templates/DeleteDomain.xml
A    simpledb-dev/simpledb-dev/src/templates/CreateDomain.xml
A    simpledb-dev/simpledb-dev/src/templates/PutAttributes.xml

(jail)hdknr@mailjail:~/.ve/jail/src$ pip install web.py
Requirement already satisfied: web.py in /usr/lib/pymodules/python2.5
Installing collected packages: web.py
Successfully installed web.py
(jail)hdknr@mailjail:~/.ve/jail/src$ dpkg -l | grep webpy
ii  python-webpy                      1:0.32+dak1-1              Web framework for Python applications

Oh well, good enough.

(jail)hdknr@mailjail:~/.ve/jail/src/simpledb-dev/simpledb-dev/src$ pwd
/home/hdknr/.ve/jail/src/simpledb-dev/simpledb-dev/src

(jail)hdknr@mailjail:~/.ve/jail/src/simpledb-dev/simpledb-dev/src$ python simpledb_dev.py
http://0.0.0.0:8080/


hdknr@mailjail:~/.ve/jail/src$ curl http://localhost:8080/
<?xml version="1.0"?>
<Response xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
        <Errors>
                <Error>
                        <Code>NoSuchVersion</Code>
                        <Message>SimpleDB/dev only supports version 2007-11-07 currently</Message>
                        <BoxUsage>0.0000219907</BoxUsage>
                </Error>
        </Errors>
        <RequestID>5ba318a0-001f-4df0-9542-886cbf6cd705</RequestID>
</Response>
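
The bare `curl` above fails because the request carries no `Version` parameter. A minimal sketch of building a request URL that includes it might look like this. The `build_request_url` helper and the `ENDPOINT` constant are hypothetical names; the placeholder values for `AWSAccessKeyId`, `Timestamp` and `Signature` copy the sample requests printed by the test run, on the assumption that the local server does not validate them.

```python
from urllib.parse import urlencode

# Assumed local endpoint: the port printed when simpledb_dev.py starts.
ENDPOINT = "http://localhost:8080/"


def build_request_url(action, **params):
    # Every request carries Action and Version; a request without Version
    # is answered with the NoSuchVersion error shown above.
    query = {"Action": action, "Version": "2007-11-07"}
    # Placeholder credentials, mirroring the sample requests of the test run.
    query.update({"AWSAccessKeyId": "Test", "Timestamp": "XXX", "Signature": "XXX"})
    query.update(params)
    # Sort the parameters so the resulting URL is deterministic.
    return ENDPOINT + "?" + urlencode(sorted(query.items()))


url = build_request_url("ListDomains")
print(url)
```

Passing the printed URL to `curl` (in quotes, because of the `&` characters) should then reach the action handler instead of the version check.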

(jail)hdknr@mailjail:~/.ve/jail/src/simpledb-dev/simpledb-dev/src$ python simpledb_dev.py test > /tmp/simpledb_dev.log

Checking simpledb_dev.log:

Running tests and printing out sample XML output...

Sample GetAttributes:

?AWSAccessKeyId=Test&DomainName=TestDomain&Timestamp=XXX&Version=2007-11-07&Signature=XXX&Action=GetAttributes&ItemName=0385333498

<?xml version="1.0"?>
<GetAttributesResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
        <GetAttributesResult>
                        <Attribute><Name>Rating</Name><Value>5 stars</Value></Attribute>
                        <Attribute><Name>Rating</Name><Value>*****</Value></Attribute>
                        <Attribute><Name>Rating</Name><Value>Excellent</Value></Attribute>
                        <Attribute><Name>Keyword</Name><Value>Book</Value></Attribute>
                        <Attribute><Name>Keyword</Name><Value>Paperback</Value></Attribute>
                        <Attribute><Name>Title</Name><Value>The Sirens of Titan</Value></Attribute>
                        <Attribute><Name>Author</Name><Value>Kurt Vonnegut</Value></Attribute>
                        <Attribute><Name>Year</Name><Value>1959</Value></Attribute>
                        <Attribute><Name>Pages</Name><Value>00336</Value></Attribute>
        </GetAttributesResult>
        <ResponseMetadata>
                <RequestId>3175e02f-a69f-4e88-ad98-a22ceb6d8a9f</RequestId>
                <BoxUsage>0.0000219907</BoxUsage>
        </ResponseMetadata>
</GetAttributesResponse>
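
A response like the one above can be read with the standard library. The following is a sketch, with `parse_attributes` as a hypothetical helper name and `SAMPLE` a shortened copy of the response; it groups values by attribute name so that multi-valued attributes such as `Rating` come back as lists.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace used by the 2007-11-07 responses.
NS = {"sdb": "http://sdb.amazonaws.com/doc/2007-11-07/"}

# Shortened copy of the sample GetAttributes response.
SAMPLE = """<?xml version="1.0"?>
<GetAttributesResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
  <GetAttributesResult>
    <Attribute><Name>Rating</Name><Value>5 stars</Value></Attribute>
    <Attribute><Name>Rating</Name><Value>*****</Value></Attribute>
    <Attribute><Name>Title</Name><Value>The Sirens of Titan</Value></Attribute>
  </GetAttributesResult>
</GetAttributesResponse>"""


def parse_attributes(xml_text):
    # Collect every Value under its Name, preserving document order.
    root = ET.fromstring(xml_text)
    result = defaultdict(list)
    for attr in root.findall("sdb:GetAttributesResult/sdb:Attribute", NS):
        name = attr.find("sdb:Name", NS).text
        value = attr.find("sdb:Value", NS).text
        result[name].append(value)
    return dict(result)


attrs = parse_attributes(SAMPLE)
print(attrs)
```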

Sample Query:

?AWSAccessKeyId=Test&DomainName=TestDomain&Timestamp=XXX&QueryExpression=%5B%27Year%27+%3D+%272007%27%5D+intersection+%5B%27Author%27+starts-with+%27%27%5D+sort+%27Author%27+desc&Version=2007-11-07&Signature=XXX&Action=Query

<?xml version="1.0"?>
<QueryResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
<QueryResult>
    <ItemName>B00005JPLW</ItemName>
    <ItemName>B000T9886K</ItemName>
</QueryResult>
<ResponseMetadata>
        <RequestId>4f4bcb4e-56cb-43a5-a9aa-7a5da26ca46e</RequestId>
        <BoxUsage>0.0000219907</BoxUsage>
</ResponseMetadata>
</QueryResponse>
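
The `QueryExpression` parameter in the sample request is hard to read in its URL-encoded form. As a small aid, it can be decoded (and re-encoded) with the standard library; the variable names are arbitrary.

```python
from urllib.parse import quote_plus, unquote_plus

# QueryExpression value copied from the sample Query request.
encoded = (
    "%5B%27Year%27+%3D+%272007%27%5D+intersection+"
    "%5B%27Author%27+starts-with+%27%27%5D+sort+%27Author%27+desc"
)

# Decode percent-escapes and turn "+" back into spaces.
expression = unquote_plus(encoded)
print(expression)

# Encoding the readable form again yields the original parameter value.
roundtrip = quote_plus(expression)
```

The decoded expression is two bracketed predicates joined by `intersection`, followed by a sort clause on `Author`.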

Sample QueryWithAttributes:

?AWSAccessKeyId=Test&DomainName=TestDomain&Timestamp=XXX&QueryExpression=%5B%27Title%27+%3D+%27The+Right+Stuff%27%5D&Version=2007-11-07&Signature=XXX&Action=QueryWithAttributes

<?xml version="1.0"?>
<QueryWithAttributesResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
<QueryWithAttributesResult>
    <Item>
            <Name>1579124585</Name>
            <Attribute><Name>Rating</Name><Value>4 stars</Value></Attribute>
            <Attribute><Name>Rating</Name><Value>****</Value></Attribute>
            <Attribute><Name>Keyword</Name><Value>Hardcover</Value></Attribute>
            <Attribute><Name>Keyword</Name><Value>Book</Value></Attribute>
            <Attribute><Name>Keyword</Name><Value>American</Value></Attribute>
            <Attribute><Name>Title</Name><Value>The Right Stuff</Value></Attribute>
            <Attribute><Name>Author</Name><Value>Tom Wolfe</Value></Attribute>
            <Attribute><Name>Year</Name><Value>1979</Value></Attribute>
            <Attribute><Name>Pages</Name><Value>00304</Value></Attribute>
    </Item>
</QueryWithAttributesResult>
<ResponseMetadata>
        <RequestId>fc29f8ef-6298-4712-8f61-16ccc9e48c73</RequestId>
        <BoxUsage>0.0000219907</BoxUsage>
</ResponseMetadata>
</QueryWithAttributesResponse>

Sample PutAttributes:

?AWSAccessKeyId=Test&DomainName=TestDomain&Timestamp=XXX&Attribute.0.Name=Rating&Version=2007-11-07&Signature=XXX&Action=PutAttributes&Attribute.0.Value=%2A%2A%2A%2A%2A&Attribute.0.Replace=true&ItemName=B00005JPLW

<?xml version="1.0"?>
<PutAttributesResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
        <ResponseMetadata>
                <RequestId>2131099b-8f38-4b14-a803-f9bd09c26fce</RequestId>
                <BoxUsage>0.0000219907</BoxUsage>
        </ResponseMetadata>
</PutAttributesResponse>
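
The sample PutAttributes request numbers its attributes as `Attribute.0.Name`, `Attribute.0.Value` and `Attribute.0.Replace`. A sketch of generating those parameters from a list of pairs is shown below; `put_attributes_params` is a hypothetical helper, and it builds only the action-specific part of the query (the domain, version and credential parameters are left out).

```python
from urllib.parse import urlencode


def put_attributes_params(item_name, attributes, replace=False):
    # attributes: list of (name, value) pairs; indices start at 0,
    # matching the sample request.
    params = [("Action", "PutAttributes"), ("ItemName", item_name)]
    for i, (name, value) in enumerate(attributes):
        params.append(("Attribute.%d.Name" % i, name))
        params.append(("Attribute.%d.Value" % i, value))
        if replace:
            # Replace=true overwrites existing values instead of adding one.
            params.append(("Attribute.%d.Replace" % i, "true"))
    return params


params = put_attributes_params("B00005JPLW", [("Rating", "*****")], replace=True)
query = urlencode(params)
print(query)
```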

Sample Query:

?AWSAccessKeyId=Test&DomainName=TestDomain&Timestamp=XXX&QueryExpression=%5B%27Pages%27+%3C+%2700320%27%5D&Version=2007-11-07&Signature=XXX&Action=Query

<?xml version="1.0"?>
<QueryResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
<QueryResult>
    <ItemName>0802131786</ItemName>
</QueryResult>
<ResponseMetadata>
        <RequestId>829b9b6e-c91e-47e8-8e27-58c59084136c</RequestId>
        <BoxUsage>0.0000219907</BoxUsage>
</ResponseMetadata>
</QueryResponse>

Sample CreateDomain:

?AWSAccessKeyId=Test&DomainName=TestDomainXXX&Timestamp=XXX&Version=2007-11-07&Signature=XXX&Action=CreateDomain

<?xml version="1.0"?>
<CreateDomainResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
        <ResponseMetadata>
                <RequestId>2e4e3435-a629-4f5b-9fb1-d752680567f7</RequestId>
                <BoxUsage>0.0000219907</BoxUsage>
        </ResponseMetadata>
</CreateDomainResponse>

Sample ListDomains:

?AWSAccessKeyId=Test&DomainName=TestDomain&Timestamp=XXX&Version=2007-11-07&Signature=XXX&Action=ListDomains

<?xml version="1.0"?>
<ListDomainsResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
<ListDomainsResult>
    <DomainName>TestDomain</DomainName>
    <DomainName>TestDomainXXX</DomainName>
 </ListDomainsResult>
<ResponseMetadata>
        <RequestId>8d429dd8-adb5-4224-a632-eae03e40b20b</RequestId>
        <BoxUsage>0.0000219907</BoxUsage>
</ResponseMetadata>
</ListDomainsResponse>

Sample DeleteDomain:

?AWSAccessKeyId=Test&DomainName=TestDomainXXX&Timestamp=XXX&Version=2007-11-07&Signature=XXX&Action=DeleteDomain

<?xml version="1.0"?>
<DeleteDomainResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
        <ResponseMetadata>
                <RequestId>9ebcae0b-4a2e-4441-a3a1-d856d8b2f774</RequestId>
                <BoxUsage>0.0000219907</BoxUsage>
        </ResponseMetadata>
</DeleteDomainResponse>


OK

Filed under: SimpleDB

hdknr says...

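The open-source software introduced this time is SimpleDB/dev, a SimpleDB clone written in Python.

SimpleDB/dev lets you run SimpleDB, one of the Amazon Web Services, on your local machine. SimpleDB is a database that holds no schema information and makes it easy to store and retrieve data.

[Screenshot: the test script being run; XML data is returned]

SimpleDB/dev starts on port 8080 by default. Once the service is up, set localhost as the development endpoint and develop against it. SimpleDB/dev does not aim to replace SimpleDB, so treat it as a development tool.

In terms of specification, it supports the features of the 2007-11-07 version of the REST API. Every action is supported, and the HTTP responses are built to match those of the real service. What it lacks, conversely, is SOAP API support, authentication, timestamp format checking, and HTTPS.

Other libraries with similar functionality exist. But because the way clients connect to the API does not change, the appeal is that you can choose freely, regardless of implementation language. Whether you develop in Ruby or in PHP, it should be easy to use as long as you have a client library. Anyone developing with SimpleDB should check it out.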

Filed under: SimpleDB

hdknr says...

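In SimpleDB, all work is done on a domain (the equivalent of a sheet), so you need to be able to create domains, delete them, and connect to domains you have already created. I tried out the boto functions that implement these operations. They are simple enough that they hardly need explanation.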
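
The original post does not show the calls it used, so the following is only a sketch of the create, connect and delete cycle. The method names `create_domain`, `get_domain` and `delete_domain` follow boto 2.x's SimpleDB connection as far as I know; `InMemoryConnection` is a hypothetical stand-in with the same method names, included so that the example runs without AWS credentials.

```python
class InMemoryConnection:
    """Hypothetical stand-in for a SimpleDB connection (not part of boto)."""

    def __init__(self):
        self.domains = set()

    def create_domain(self, name):
        self.domains.add(name)
        return name

    def get_domain(self, name):
        # Connecting to a domain that does not exist is an error.
        if name not in self.domains:
            raise KeyError(name)
        return name

    def delete_domain(self, name):
        self.domains.discard(name)
        return True


def domain_lifecycle(conn, name):
    # Create the domain, connect to it, then remove it again.
    conn.create_domain(name)
    domain = conn.get_domain(name)
    conn.delete_domain(name)
    return domain


# With boto 2.x installed and credentials configured, the connection would
# presumably come from boto.connect_sdb() instead (assumption, not run here).
conn = InMemoryConnection()
result = domain_lifecycle(conn, "TestDomain")
print(result, conn.domains)
```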

Filed under: SimpleDB

hdknr says...

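One more thing, and it is a rather ugly workaround: SimpleDB can only handle strings, so sorting on numeric attributes goes wrong (2 is judged to be larger than 100, because the ordering is lexicographic). Numbers therefore have to be stored as fixed-width, zero-padded strings (00000002 rather than 2). Handling only strings is apparently part of SimpleDB's specification, but I wish something could be done about it.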
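
The sorting problem and the zero-padding workaround can be reproduced with plain Python strings. The width of 8 digits matches the `00000002` example; any fixed width that fits the largest expected value would do.

```python
values = [2, 100, 15]

# Stored as plain strings, the values sort lexicographically:
# "100" comes before "15", which comes before "2".
as_strings = sorted(str(v) for v in values)
print(as_strings)

# Stored as fixed-width, zero-padded strings, lexicographic order
# agrees with numeric order.
padded = sorted("%08d" % v for v in values)
print(padded)

# Reading a value back is a plain integer conversion.
restored = [int(p) for p in padded]
```

Note that this simple padding only covers non-negative integers; negative numbers and decimals would need a different encoding.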

Filed under: SimpleDB

hdknr says...

Performance

How does SimpleDB's performance compare with that of a database such as MySQL? A major issue with SimpleDB is that every request is sent over HTTP. Even with Amazon's robust infrastructure behind it, this can become a performance problem.

One way to address this is to place a cache between the web application and SimpleDB, and to issue requests in batches rather than sending many small ones.
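
A minimal sketch of that idea, under the assumption that the backend can return several items in one round trip: `CachedReader` and `fake_fetch` are hypothetical names, and `fake_fetch` merely records how often it is called in place of a real HTTP request.

```python
class CachedReader:
    """Read-through cache that fetches all missing items in one batch."""

    def __init__(self, fetch_many):
        # fetch_many: callable taking a list of item names and returning
        # a dict of item name -> attributes (one round trip per call).
        self.fetch_many = fetch_many
        self.cache = {}

    def get_many(self, names):
        # Only items not yet cached are requested, and in a single batch.
        missing = [n for n in names if n not in self.cache]
        if missing:
            self.cache.update(self.fetch_many(missing))
        return {n: self.cache.get(n) for n in names}


calls = []


def fake_fetch(names):
    # Stand-in for the remote call; records each batch it receives.
    calls.append(list(names))
    return {n: {"Title": n.upper()} for n in names}


reader = CachedReader(fake_fetch)
first = reader.get_many(["a", "b"])
second = reader.get_many(["b", "c"])
print(calls)
```

Four item lookups cost two round trips here instead of four. A real cache would also need an expiry policy, which this sketch leaves out.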

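However, SimpleDB was never developed to replace relational databases. It is a highly scalable service that offers an opportunity to move some services and application layers outside your own infrastructure. I encourage you to read Amazon's discussion forums and find out what is actually being done with SimpleDB.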

Filed under: SimpleDB