hdknr’s posterous

 

AmazonS3 - Hadoop Wiki

Hadoop provides two filesystems that use S3.

S3 Native FileSystem (URI scheme: s3n)
A native filesystem for reading and writing regular files on S3. The advantage of this filesystem is that you can access files on S3 that were written with other tools. Conversely, other tools can access files written using Hadoop. The disadvantage is the 5GB limit on file size imposed by S3. For this reason it is not suitable as a replacement for HDFS (which has support for very large files).
S3 Block FileSystem (URI scheme: s3)
A block-based filesystem backed by S3. Files are stored as blocks, just like they are in HDFS. This permits efficient implementation of renames. This filesystem requires you to dedicate a bucket for the filesystem - you should not use an existing bucket containing files, or write other files to the same bucket. The files stored by this filesystem can be larger than 5GB, but they are not interoperable with other S3 tools.

AWS Hadoop Filesystem (HDFS)
- Native filesystem (s3n://)
- Block filesystem (s3://)
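As a rough sketch of the distinction quoted above, the snippet below looks at the URI scheme to tell which of the two filesystems a path refers to. The bucket names and the `describe` helper are made up for illustration; the properties in the table are just the ones stated in the wiki text.

```python
from urllib.parse import urlparse

# Properties of each scheme, as described in the quoted wiki text.
S3_FILESYSTEMS = {
    "s3n": {"name": "S3 Native FileSystem", "interoperable": True, "max_file_gb": 5},
    "s3": {"name": "S3 Block FileSystem", "interoperable": False, "max_file_gb": None},
}


def describe(uri):
    """Split a Hadoop-style S3 URI into filesystem info, bucket and key."""
    parts = urlparse(uri)
    if parts.scheme not in S3_FILESYSTEMS:
        raise ValueError("not an S3 filesystem URI: %r" % uri)
    info = dict(S3_FILESYSTEMS[parts.scheme])
    info["bucket"] = parts.netloc
    info["key"] = parts.path.lstrip("/")
    return info


# Hypothetical bucket names, for illustration only.
native = describe("s3n://example-bucket/logs/part-00000")
block = describe("s3://example-hadoop-only-bucket/data/big.seq")
print(native["name"], native["bucket"], native["key"])
print(block["name"], block["bucket"], block["key"])
```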

Filed under  //   AWS   Hadoop   S3  


Amazon's Dynamo - All Things Distributed

Dynamo: Amazon’s Highly Available Key-value Store

Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall and Werner Vogels

Amazon.com

Filed under  //   Amazon   AWS   Dynamo  


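AWS with Python: Trying out SimpleDB (1) - lolloo-htn's diary

Today I decided to try SimpleDB. In essence, SimpleDB is like keeping a spreadsheet in the cloud that you can read and update. The figure on this page of the official documentation captures its characteristics well.

  • The spreadsheet name is the domain
  • A row corresponds to an item, a column to an attribute
  • Non-normalized: a single attribute can hold multiple values
  • There is a query API that supports filtering by condition and sorting, as you can in SQL
  • Schemaless and maintenance-free; there is no index design or sizing to do, so it is easy
  • Pay as you go; the first 1 GB is free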
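To make the spreadsheet analogy concrete, here is a minimal in-memory sketch of the data model in the list above: a domain holds items, each item holds attributes, and each attribute can hold several values. The `Domain` class and its methods are hypothetical stand-ins written for this note; they are not boto's API or the real service.

```python
from collections import defaultdict


class Domain:
    """In-memory stand-in for one SimpleDB domain (hypothetical, not boto)."""

    def __init__(self, name):
        self.name = name
        # item name -> attribute name -> set of string values
        self.items = defaultdict(lambda: defaultdict(set))

    def put(self, item, attribute, value, replace=False):
        values = self.items[item][attribute]
        if replace:
            values.clear()
        values.add(str(value))  # this sketch stores every value as a string

    def get(self, item):
        return {attr: sorted(vals) for attr, vals in self.items[item].items()}

    def query(self, attribute, predicate):
        """Return sorted names of items where any value of `attribute` matches."""
        return sorted(
            name
            for name, attrs in self.items.items()
            if any(predicate(v) for v in attrs.get(attribute, ()))
        )


books = Domain("books")
books.put("0385333498", "Title", "The Sirens of Titan")
books.put("0385333498", "Keyword", "Book")
books.put("0385333498", "Keyword", "Paperback")  # second value, same attribute
books.put("1579124585", "Title", "The Right Stuff")
books.put("1579124585", "Keyword", "Hardcover")

print(books.get("0385333498"))
print(books.query("Keyword", lambda v: v == "Hardcover"))
```

No schema is declared anywhere: an attribute exists on an item as soon as a value is put there.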

Filed under  //   Amazon   AWS   boto   SimpleDB  


AWS: django-storages : install

(dev)hdknr@domU-12-31-39-00-D9-A1:~/.ve/dev/src$ hg clone https://hdknr@bitbucket.org/david/django-storages/
destination directory: django-storages
requesting all changes
adding changesets
adding manifests
adding file changes
added 43 changesets with 105 changes to 51 files
updating working directory
32 files updated, 0 files merged, 0 files removed, 0 files unresolved
(dev)hdknr@domU-12-31-39-00-D9-A1:~/.ve/dev/src$ cd django-storages/
(dev)hdknr@domU-12-31-39-00-D9-A1:~/.ve/dev/src/django-storages$ ls -al
total 68
drwxr-xr-x 6 hdknr users 4096 2009-10-11 05:20 .
drwxr-xr-x 3 hdknr users 4096 2009-10-11 05:20 ..
-rw-r--r-- 1 hdknr users 562 2009-10-11 05:20 AUTHORS
drwxr-xr-x 2 hdknr users 4096 2009-10-11 05:20 backends
drwxr-xr-x 2 hdknr users 4096 2009-10-11 05:20 docs
drwxr-xr-x 4 hdknr users 4096 2009-10-11 05:20 examples
drwxr-xr-x 3 hdknr users 4096 2009-10-11 05:20 .hg
-rw-r--r-- 1 hdknr users 103 2009-10-11 05:20 .hgignore
-rw-r--r-- 1 hdknr users 1539 2009-10-11 05:20 LICENSE
-rw-r--r-- 1 hdknr users 315 2009-10-11 05:20 README
-rw-r--r-- 1 hdknr users 21229 2009-10-11 05:20 S3.py
-rw-r--r-- 1 hdknr users 988 2009-10-11 05:20 setup.py


(dev)hdknr@domU-12-31-39-00-D9-A1:~/.ve/dev/src/django-storages$ python setup.py install
zip_safe flag not set; analyzing archive contents...

Installed /home/hdknr/.ve/dev/src/django-storages/setuptools_hg-0.2-py2.5.egg
running install
running bdist_egg
running egg_info
creating django_storages.egg-info
writing django_storages.egg-info/PKG-INFO
writing top-level names to django_storages.egg-info/top_level.txt
writing dependency_links to django_storages.egg-info/dependency_links.txt
writing manifest file 'django_storages.egg-info/SOURCES.txt'
writing manifest file 'django_storages.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-i686/egg
running install_lib
running build_py
creating build
creating build/lib
copying S3.py -> build/lib
creating build/lib/backends
copying backends/__init__.py -> build/lib/backends
copying backends/s3boto.py -> build/lib/backends
copying backends/couchdb.py -> build/lib/backends
copying backends/symlinkorcopy.py -> build/lib/backends
copying backends/s3.py -> build/lib/backends
copying backends/database.py -> build/lib/backends
copying backends/mosso.py -> build/lib/backends
copying backends/overwrite.py -> build/lib/backends
copying backends/ftp.py -> build/lib/backends
copying backends/mogile.py -> build/lib/backends
copying backends/image.py -> build/lib/backends
creating build/bdist.linux-i686
creating build/bdist.linux-i686/egg
creating build/bdist.linux-i686/egg/backends
copying build/lib/backends/__init__.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/s3boto.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/couchdb.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/symlinkorcopy.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/s3.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/database.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/mosso.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/overwrite.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/ftp.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/mogile.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/image.py -> build/bdist.linux-i686/egg/backends
copying build/lib/S3.py -> build/bdist.linux-i686/egg
byte-compiling build/bdist.linux-i686/egg/backends/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-i686/egg/backends/s3boto.py to s3boto.pyc
byte-compiling build/bdist.linux-i686/egg/backends/couchdb.py to couchdb.pyc
byte-compiling build/bdist.linux-i686/egg/backends/symlinkorcopy.py to symlinkorcopy.pyc
byte-compiling build/bdist.linux-i686/egg/backends/s3.py to s3.pyc
byte-compiling build/bdist.linux-i686/egg/backends/database.py to database.pyc
byte-compiling build/bdist.linux-i686/egg/backends/mosso.py to mosso.pyc
byte-compiling build/bdist.linux-i686/egg/backends/overwrite.py to overwrite.pyc
byte-compiling build/bdist.linux-i686/egg/backends/ftp.py to ftp.pyc
byte-compiling build/bdist.linux-i686/egg/backends/mogile.py to mogile.pyc
byte-compiling build/bdist.linux-i686/egg/backends/image.py to image.pyc
byte-compiling build/bdist.linux-i686/egg/S3.py to S3.pyc
creating build/bdist.linux-i686/egg/EGG-INFO
copying django_storages.egg-info/PKG-INFO -> build/bdist.linux-i686/egg/EGG-INFO
copying django_storages.egg-info/SOURCES.txt -> build/bdist.linux-i686/egg/EGG-INFO
copying django_storages.egg-info/dependency_links.txt -> build/bdist.linux-i686/egg/EGG-INFO
copying django_storages.egg-info/not-zip-safe -> build/bdist.linux-i686/egg/EGG-INFO
copying django_storages.egg-info/top_level.txt -> build/bdist.linux-i686/egg/EGG-INFO
creating dist
creating 'dist/django_storages-1.0-py2.5.egg' and adding 'build/bdist.linux-i686/egg' to it
removing 'build/bdist.linux-i686/egg' (and everything under it)
Processing django_storages-1.0-py2.5.egg
creating /home/hdknr/.ve/dev/lib/python2.5/site-packages/django_storages-1.0-py2.5.egg
Extracting django_storages-1.0-py2.5.egg to /home/hdknr/.ve/dev/lib/python2.5/site-packages
Adding django-storages 1.0 to easy-install.pth file

Installed /home/hdknr/.ve/dev/lib/python2.5/site-packages/django_storages-1.0-py2.5.egg
Processing dependencies for django-storages==1.0
Finished processing dependencies for django-storages==1.0

Filed under  //   AWS   Python   S3   SimpleDB  


Amazon SimpleDB

Data Storage in Amazon SimpleDB vs. Data Storage in Amazon S3

Unlike Amazon S3, Amazon SimpleDB is not storing raw data. Rather, it takes your data as input and expands it to create indices across multiple dimensions, which enables you to quickly query that data. Additionally, Amazon S3 and Amazon SimpleDB use different types of physical storage. Amazon S3 uses dense storage drives that are optimized for storing larger objects inexpensively. Amazon SimpleDB stores smaller bits of data and uses less dense drives that are optimized for data access speed.

In order to optimize your costs across AWS services, large objects or files should be stored in Amazon S3, while smaller data elements or file pointers (possibly to Amazon S3 objects) are best saved in Amazon SimpleDB. Because of the close integration between services and the free data transfer within the AWS environment, developers can easily take advantage of both the speed and querying capabilities of Amazon SimpleDB as well as the low cost of storing data in Amazon S3, by integrating both services into their applications.

For the Beta release, a single Amazon SimpleDB domain may grow to 10 GB and you are initially allocated a maximum of 100 domains; however, over time these allocations may be raised. Please complete this form if you require additional domains.

Big data goes in S3; small data, such as pointer information, goes in SimpleDB.
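The split described above can be sketched roughly like this: the payload goes to an object store, and a small record with a pointer to it goes to a metadata store. Both stores here are plain dictionaries standing in for S3 and SimpleDB, and the key layout (`docs/<sha1>`) and attribute names are assumptions made for the example.

```python
import hashlib

object_store = {}    # stand-in for an S3 bucket: key -> bytes
metadata_store = {}  # stand-in for a SimpleDB domain: item name -> attributes


def save_document(item_name, title, payload):
    """Store the large payload once, and keep only a pointer in the metadata."""
    key = "docs/" + hashlib.sha1(payload).hexdigest()
    object_store[key] = payload
    metadata_store[item_name] = {
        "Title": title,
        "Size": str(len(payload)),
        "S3Key": key,  # pointer to the object, not the object itself
    }
    return key


def load_document(item_name):
    """Follow the pointer from the metadata record to the stored payload."""
    return object_store[metadata_store[item_name]["S3Key"]]


payload = b"x" * 1000000
save_document("report-2009-10", "Monthly report", payload)
print(metadata_store["report-2009-10"]["Size"])
```

Queries then run against the small records only, and the large object is fetched once the right record has been found.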

Filed under  //   AWS   S3   SimpleDB  


SimpleDB: Testing the Python SimpleDB/dev

(jail)hdknr@mailjail:~/.ve/jail/src$ svn checkout http://simpledb-dev.googlecode.com/svn/trunk/ simpledb-dev
A    simpledb-dev/simpledb-dev
A    simpledb-dev/simpledb-dev/src
A    simpledb-dev/simpledb-dev/src/simpledb_dev.py
A    simpledb-dev/simpledb-dev/src/portalocker.py
A    simpledb-dev/simpledb-dev/src/templates
A    simpledb-dev/simpledb-dev/src/templates/Query.xml
A    simpledb-dev/simpledb-dev/src/templates/GetAttributes.xml
A    simpledb-dev/simpledb-dev/src/templates/ListDomains.xml
A    simpledb-dev/simpledb-dev/src/templates/QueryWithAttributes.xml
A    simpledb-dev/simpledb-dev/src/templates/DeleteAttributes.xml
A    simpledb-dev/simpledb-dev/src/templates/error.xml
A    simpledb-dev/simpledb-dev/src/templates/DeleteDomain.xml
A    simpledb-dev/simpledb-dev/src/templates/CreateDomain.xml
A    simpledb-dev/simpledb-dev/src/templates/PutAttributes.xml

(jail)hdknr@mailjail:~/.ve/jail/src$ pip install web.py
Requirement already satisfied: web.py in /usr/lib/pymodules/python2.5
Installing collected packages: web.py
Successfully installed web.py
(jail)hdknr@mailjail:~/.ve/jail/src$ dpkg -l | grep webpy
ii  python-webpy                      1:0.32+dak1-1              Web framework for Python applications

Oh well, good enough.

(jail)hdknr@mailjail:~/.ve/jail/src/simpledb-dev/simpledb-dev/src$ pwd
/home/hdknr/.ve/jail/src/simpledb-dev/simpledb-dev/src

(jail)hdknr@mailjail:~/.ve/jail/src/simpledb-dev/simpledb-dev/src$ python simpledb_dev.py
http://0.0.0.0:8080/


hdknr@mailjail:~/.ve/jail/src$ curl http://localhost:8080/
<?xml version="1.0"?>
<Response xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
        <Errors>
                <Error>
                        <Code>NoSuchVersion</Code>
                        <Message>SimpleDB/dev only supports version 2007-11-07 currently</Message>
                        <BoxUsage>0.0000219907</BoxUsage>
                </Error>
        </Errors>
        <RequestID>5ba318a0-001f-4df0-9542-886cbf6cd705</RequestID>
</Response>
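The bare curl request above carries no parameters at all, which is presumably why the server answers NoSuchVersion. As a sketch, a request URL with an explicit Version could be assembled as below. The `build_url` helper is hypothetical; the parameter names and the placeholder values (`Test`, `XXX`) are copied from the sample requests that the test run prints, and whether a given server accepts them is an assumption.

```python
from urllib.parse import urlencode

API_VERSION = "2007-11-07"  # the only version the error message says is supported


def build_url(action, base="http://localhost:8080/", **params):
    """Assemble a REST query URL for a local SimpleDB/dev instance."""
    query = {
        "Action": action,
        "Version": API_VERSION,
        "AWSAccessKeyId": "Test",  # placeholder, as in the test log
        "Timestamp": "XXX",        # placeholder, as in the test log
        "Signature": "XXX",        # placeholder, as in the test log
    }
    query.update(params)
    # Sort the parameters so the resulting URL is stable across runs.
    return base + "?" + urlencode(sorted(query.items()))


list_url = build_url("ListDomains")
query_url = build_url(
    "Query",
    DomainName="TestDomain",
    QueryExpression="['Pages' < '00320']",
)
print(list_url)
print(query_url)
```

`urlencode` takes care of escaping the brackets, quotes and spaces in the query expression.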

(jail)hdknr@mailjail:~/.ve/jail/src/simpledb-dev/simpledb-dev/src$ python simpledb_dev.py test > /tmp/simpledb_dev.log

Checking simpledb_dev.log:

Running tests and printing out sample XML output...

Sample GetAttributes:

?AWSAccessKeyId=Test&DomainName=TestDomain&Timestamp=XXX&Version=2007-11-07&Signature=XXX&Action=GetAttributes&ItemName=0385333498

<?xml version="1.0"?>
<GetAttributesResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
        <GetAttributesResult>
                        <Attribute><Name>Rating</Name><Value>5 stars</Value></Attribute>
                        <Attribute><Name>Rating</Name><Value>*****</Value></Attribute>
                        <Attribute><Name>Rating</Name><Value>Excellent</Value></Attribute>
                        <Attribute><Name>Keyword</Name><Value>Book</Value></Attribute>
                        <Attribute><Name>Keyword</Name><Value>Paperback</Value></Attribute>
                        <Attribute><Name>Title</Name><Value>The Sirens of Titan</Value></Attribute>
                        <Attribute><Name>Author</Name><Value>Kurt Vonnegut</Value></Attribute>
                        <Attribute><Name>Year</Name><Value>1959</Value></Attribute>
                        <Attribute><Name>Pages</Name><Value>00336</Value></Attribute>
        </GetAttributesResult>
        <ResponseMetadata>
                <RequestId>3175e02f-a69f-4e88-ad98-a22ceb6d8a9f</RequestId>
                <BoxUsage>0.0000219907</BoxUsage>
        </ResponseMetadata>
</GetAttributesResponse>

Sample Query:

?AWSAccessKeyId=Test&DomainName=TestDomain&Timestamp=XXX&QueryExpression=%5B%27Year%27+%3D+%272007%27%5D+intersection+%5B%27Author%27+starts-with+%27%27%5D+sort+%27Author%27+desc&Version=2007-11-07&Signature=XXX&Action=Query

<?xml version="1.0"?>
<QueryResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
<QueryResult>
    <ItemName>B00005JPLW</ItemName>
    <ItemName>B000T9886K</ItemName>
</QueryResult>
<ResponseMetadata>
        <RequestId>4f4bcb4e-56cb-43a5-a9aa-7a5da26ca46e</RequestId>
        <BoxUsage>0.0000219907</BoxUsage>
</ResponseMetadata>
</QueryResponse>

Sample QueryWithAttributes:

?AWSAccessKeyId=Test&DomainName=TestDomain&Timestamp=XXX&QueryExpression=%5B%27Title%27+%3D+%27The+Right+Stuff%27%5D&Version=2007-11-07&Signature=XXX&Action=QueryWithAttributes

<?xml version="1.0"?>
<QueryWithAttributesResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
<QueryWithAttributesResult>
    <Item>
            <Name>1579124585</Name>
            <Attribute><Name>Rating</Name><Value>4 stars</Value></Attribute>
            <Attribute><Name>Rating</Name><Value>****</Value></Attribute>
            <Attribute><Name>Keyword</Name><Value>Hardcover</Value></Attribute>
            <Attribute><Name>Keyword</Name><Value>Book</Value></Attribute>
            <Attribute><Name>Keyword</Name><Value>American</Value></Attribute>
            <Attribute><Name>Title</Name><Value>The Right Stuff</Value></Attribute>
            <Attribute><Name>Author</Name><Value>Tom Wolfe</Value></Attribute>
            <Attribute><Name>Year</Name><Value>1979</Value></Attribute>
            <Attribute><Name>Pages</Name><Value>00304</Value></Attribute>
    </Item>
</QueryWithAttributesResult>
<ResponseMetadata>
        <RequestId>fc29f8ef-6298-4712-8f61-16ccc9e48c73</RequestId>
        <BoxUsage>0.0000219907</BoxUsage>
</ResponseMetadata>
</QueryWithAttributesResponse>

Sample PutAttributes:

?AWSAccessKeyId=Test&DomainName=TestDomain&Timestamp=XXX&Attribute.0.Name=Rating&Version=2007-11-07&Signature=XXX&Action=PutAttributes&Attribute.0.Value=%2A%2A%2A%2A%2A&Attribute.0.Replace=true&ItemName=B00005JPLW

<?xml version="1.0"?>
<PutAttributesResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
        <ResponseMetadata>
                <RequestId>2131099b-8f38-4b14-a803-f9bd09c26fce</RequestId>
                <BoxUsage>0.0000219907</BoxUsage>
        </ResponseMetadata>
</PutAttributesResponse>

Sample Query:

?AWSAccessKeyId=Test&DomainName=TestDomain&Timestamp=XXX&QueryExpression=%5B%27Pages%27+%3C+%2700320%27%5D&Version=2007-11-07&Signature=XXX&Action=Query

<?xml version="1.0"?>
<QueryResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
<QueryResult>
    <ItemName>0802131786</ItemName>
</QueryResult>
<ResponseMetadata>
        <RequestId>829b9b6e-c91e-47e8-8e27-58c59084136c</RequestId>
        <BoxUsage>0.0000219907</BoxUsage>
</ResponseMetadata>
</QueryResponse>

Sample CreateDomain:

?AWSAccessKeyId=Test&DomainName=TestDomainXXX&Timestamp=XXX&Version=2007-11-07&Signature=XXX&Action=CreateDomain

<?xml version="1.0"?>
<CreateDomainResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
        <ResponseMetadata>
                <RequestId>2e4e3435-a629-4f5b-9fb1-d752680567f7</RequestId>
                <BoxUsage>0.0000219907</BoxUsage>
        </ResponseMetadata>
</CreateDomainResponse>

Sample ListDomains:

?AWSAccessKeyId=Test&DomainName=TestDomain&Timestamp=XXX&Version=2007-11-07&Signature=XXX&Action=ListDomains

<?xml version="1.0"?>
<ListDomainsResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
<ListDomainsResult>
    <DomainName>TestDomain</DomainName>
    <DomainName>TestDomainXXX</DomainName>
 </ListDomainsResult>
<ResponseMetadata>
        <RequestId>8d429dd8-adb5-4224-a632-eae03e40b20b</RequestId>
        <BoxUsage>0.0000219907</BoxUsage>
</ResponseMetadata>
</ListDomainsResponse>

Sample DeleteDomain:

?AWSAccessKeyId=Test&DomainName=TestDomainXXX&Timestamp=XXX&Version=2007-11-07&Signature=XXX&Action=DeleteDomain

<?xml version="1.0"?>
<DeleteDomainResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
        <ResponseMetadata>
                <RequestId>9ebcae0b-4a2e-4441-a3a1-d856d8b2f774</RequestId>
                <BoxUsage>0.0000219907</BoxUsage>
        </ResponseMetadata>
</DeleteDomainResponse>


OK
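For reading responses like the GetAttributes sample in the log, something along these lines should work with only the standard library. The XML below is a shortened copy of that sample; the `parse_get_attributes` helper is made up for this note. Because one attribute name can appear several times, the values are collected into lists.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace used by every response in the log.
NS = {"sdb": "http://sdb.amazonaws.com/doc/2007-11-07/"}

# Shortened copy of the GetAttributes sample from simpledb_dev.log.
SAMPLE = """<?xml version="1.0"?>
<GetAttributesResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
  <GetAttributesResult>
    <Attribute><Name>Rating</Name><Value>5 stars</Value></Attribute>
    <Attribute><Name>Rating</Name><Value>*****</Value></Attribute>
    <Attribute><Name>Rating</Name><Value>Excellent</Value></Attribute>
    <Attribute><Name>Title</Name><Value>The Sirens of Titan</Value></Attribute>
    <Attribute><Name>Pages</Name><Value>00336</Value></Attribute>
  </GetAttributesResult>
  <ResponseMetadata>
    <RequestId>3175e02f-a69f-4e88-ad98-a22ceb6d8a9f</RequestId>
    <BoxUsage>0.0000219907</BoxUsage>
  </ResponseMetadata>
</GetAttributesResponse>
"""


def parse_get_attributes(xml_text):
    """Return {attribute name: [values, in document order]}."""
    root = ET.fromstring(xml_text)
    attrs = defaultdict(list)
    for attr in root.findall("sdb:GetAttributesResult/sdb:Attribute", NS):
        name = attr.find("sdb:Name", NS).text
        value = attr.find("sdb:Value", NS).text
        attrs[name].append(value)
    return dict(attrs)


result = parse_get_attributes(SAMPLE)
print(result["Rating"])
print(result["Pages"])
```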

Filed under  //   AWS   Python   SimpleDB  


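MOONGIFT: » "SimpleDB/dev", a SimpleDB for development written in Python: Introducing open source every day

The open source software introduced this time is SimpleDB/dev, a SimpleDB clone written in Python.

SimpleDB/dev lets you run SimpleDB, one of the Amazon Web Services, locally. SimpleDB is a database that holds no schema information, so storing and retrieving data is easy.

[Figure: running the test script; XML data comes back]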

 

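SimpleDB/dev starts on port 8080 by default. Once the service is up, set localhost as the development endpoint and develop against it. SimpleDB/dev does not aim to replace SimpleDB, so treat it as a development tool.

In terms of specification, it supports the features of the 2007-11-07 version of the REST API. All actions are supported, and the HTTP responses are built to match as well. Conversely, the features it lacks are SOAP API support, authentication, timestamp format checking, and HTTPS.

Other libraries with similar functionality exist. But because the way of connecting to the API does not change, the appeal is that you can choose freely regardless of implementation language. Whether you develop in Ruby or in PHP, it should be easy to use as long as you have a client library. Anyone developing with SimpleDB should check it out.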

Filed under  //   Amazon   AWS   SimpleDB  


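[Cloud Forum] Is AWS safe? What are the plans for Japan? From the Q&A - News: ITpro

What are AWS's plans for expansion in Asia?

 We cannot disclose the timing or location, but we would like to build data centers for AWS in the Asia region soon. Even now, however, there are many cases of users in Japan using AWS by accessing the data centers in the US and Europe. There are also uses, such as batch processing, where network latency is not a problem.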

Filed under  //   AWS   Cloud   Security  


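AWS with Python: Trying out SimpleDB (2) - lolloo-htn's diary

In SimpleDB, work is done on a domain (the equivalent of a sheet), so you need to create domains, delete them, and connect to domains you have already created. I tried out the boto functions that implement these operations. They are simple enough that they hardly need explanation.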

Filed under  //   AWS   Python   SimpleDB  


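AWS with Python: Trying out SQS - lolloo-htn's diary

It feels like a queue sitting all by itself in the cloud, and it is very easy to use. Beyond linking it with EC2, it seems worth remembering for things like quickly parallelizing programs you already have.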
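The "quick parallel processing" idea can be sketched locally with the standard library. In this sketch `queue.Queue` is only an in-process stand-in for an SQS queue, and `run_workers` is a made-up helper; a real SQS client would talk to the service over the network and behave differently in the details.

```python
import queue
import threading


def run_workers(jobs, handler, workers=4):
    """Push jobs onto a queue and let several workers drain it in parallel."""
    q = queue.Queue()  # in-process stand-in for a queue in the cloud
    results = []
    lock = threading.Lock()

    for job in jobs:
        q.put(job)

    def worker():
        while True:
            try:
                job = q.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            out = handler(job)
            with lock:
                results.append(out)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


squares = run_workers(range(10), lambda n: n * n)
print(sorted(squares))
```

The workers never talk to each other; the queue is the only shared piece, which is what makes the pattern easy to bolt onto an existing program.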

Filed under  //   AWS   Python   SQS  
