hdknr’s posterous

 
Filed under

s3

 

AmazonS3 - Hadoop Wiki

Hadoop provides two filesystems that use S3.

S3 Native FileSystem (URI scheme: s3n)
A native filesystem for reading and writing regular files on S3. The advantage of this filesystem is that you can access files on S3 that were written with other tools. Conversely, other tools can access files written using Hadoop. The disadvantage is the 5GB limit on file size imposed by S3. For this reason it is not suitable as a replacement for HDFS (which has support for very large files).
S3 Block FileSystem (URI scheme: s3)
A block-based filesystem backed by S3. Files are stored as blocks, just like they are in HDFS. This permits efficient implementation of renames. This filesystem requires you to dedicate a bucket for the filesystem - you should not use an existing bucket containing files, or write other files to the same bucket. The files stored by this filesystem can be larger than 5GB, but they are not interoperable with other S3 tools.

AWS Hadoop Filesystem (HDFS)
- Native filesystem ( s3n:// )
- Block filesystem ( s3:// )
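
As a rough illustration of how the two URI schemes differ, the sketch below classifies a Hadoop S3 URI by its scheme and reports the properties described in the wiki excerpt. The function name `describe_s3_uri`, the dictionary layout, and the bucket names are invented for this example and are not part of Hadoop.

```python
from urllib.parse import urlparse

# Properties of the two S3-backed filesystems, as described in the wiki text.
# The dictionary layout is an assumption made for this sketch.
S3_FILESYSTEMS = {
    "s3n": {"kind": "native", "interoperable": True, "max_file_size_gb": 5},
    "s3": {"kind": "block", "interoperable": False, "max_file_size_gb": None},
}


def describe_s3_uri(uri):
    """Return bucket, path, and filesystem properties for a Hadoop S3 URI."""
    parsed = urlparse(uri)
    if parsed.scheme not in S3_FILESYSTEMS:
        raise ValueError("not a Hadoop S3 URI: %r" % uri)
    info = dict(S3_FILESYSTEMS[parsed.scheme])  # copy so callers can modify it
    info["bucket"] = parsed.netloc
    info["path"] = parsed.path
    return info
```

For example, `describe_s3_uri("s3n://my-bucket/logs/part-00000")` reports a native, interoperable filesystem with the 5GB file size limit, while an `s3://` URI reports a block filesystem that other S3 tools cannot read.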

Filed under  //   AWS   Hadoop   S3  


AWS: django-storages : install

(dev)hdknr@domU-12-31-39-00-D9-A1:~/.ve/dev/src$ hg clone https://hdknr@bitbucket.org/david/django-storages/
destination directory: django-storages
requesting all changes
adding changesets
adding manifests
adding file changes
added 43 changesets with 105 changes to 51 files
updating working directory
32 files updated, 0 files merged, 0 files removed, 0 files unresolved
(dev)hdknr@domU-12-31-39-00-D9-A1:~/.ve/dev/src$ cd django-storages/
(dev)hdknr@domU-12-31-39-00-D9-A1:~/.ve/dev/src/django-storages$ ls -al
total 68
drwxr-xr-x 6 hdknr users 4096 2009-10-11 05:20 .
drwxr-xr-x 3 hdknr users 4096 2009-10-11 05:20 ..
-rw-r--r-- 1 hdknr users 562 2009-10-11 05:20 AUTHORS
drwxr-xr-x 2 hdknr users 4096 2009-10-11 05:20 backends
drwxr-xr-x 2 hdknr users 4096 2009-10-11 05:20 docs
drwxr-xr-x 4 hdknr users 4096 2009-10-11 05:20 examples
drwxr-xr-x 3 hdknr users 4096 2009-10-11 05:20 .hg
-rw-r--r-- 1 hdknr users 103 2009-10-11 05:20 .hgignore
-rw-r--r-- 1 hdknr users 1539 2009-10-11 05:20 LICENSE
-rw-r--r-- 1 hdknr users 315 2009-10-11 05:20 README
-rw-r--r-- 1 hdknr users 21229 2009-10-11 05:20 S3.py
-rw-r--r-- 1 hdknr users 988 2009-10-11 05:20 setup.py


(dev)hdknr@domU-12-31-39-00-D9-A1:~/.ve/dev/src/django-storages$ python setup.py install
zip_safe flag not set; analyzing archive contents...

Installed /home/hdknr/.ve/dev/src/django-storages/setuptools_hg-0.2-py2.5.egg
running install
running bdist_egg
running egg_info
creating django_storages.egg-info
writing django_storages.egg-info/PKG-INFO
writing top-level names to django_storages.egg-info/top_level.txt
writing dependency_links to django_storages.egg-info/dependency_links.txt
writing manifest file 'django_storages.egg-info/SOURCES.txt'
writing manifest file 'django_storages.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-i686/egg
running install_lib
running build_py
creating build
creating build/lib
copying S3.py -> build/lib
creating build/lib/backends
copying backends/__init__.py -> build/lib/backends
copying backends/s3boto.py -> build/lib/backends
copying backends/couchdb.py -> build/lib/backends
copying backends/symlinkorcopy.py -> build/lib/backends
copying backends/s3.py -> build/lib/backends
copying backends/database.py -> build/lib/backends
copying backends/mosso.py -> build/lib/backends
copying backends/overwrite.py -> build/lib/backends
copying backends/ftp.py -> build/lib/backends
copying backends/mogile.py -> build/lib/backends
copying backends/image.py -> build/lib/backends
creating build/bdist.linux-i686
creating build/bdist.linux-i686/egg
creating build/bdist.linux-i686/egg/backends
copying build/lib/backends/__init__.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/s3boto.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/couchdb.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/symlinkorcopy.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/s3.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/database.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/mosso.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/overwrite.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/ftp.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/mogile.py -> build/bdist.linux-i686/egg/backends
copying build/lib/backends/image.py -> build/bdist.linux-i686/egg/backends
copying build/lib/S3.py -> build/bdist.linux-i686/egg
byte-compiling build/bdist.linux-i686/egg/backends/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-i686/egg/backends/s3boto.py to s3boto.pyc
byte-compiling build/bdist.linux-i686/egg/backends/couchdb.py to couchdb.pyc
byte-compiling build/bdist.linux-i686/egg/backends/symlinkorcopy.py to symlinkorcopy.pyc
byte-compiling build/bdist.linux-i686/egg/backends/s3.py to s3.pyc
byte-compiling build/bdist.linux-i686/egg/backends/database.py to database.pyc
byte-compiling build/bdist.linux-i686/egg/backends/mosso.py to mosso.pyc
byte-compiling build/bdist.linux-i686/egg/backends/overwrite.py to overwrite.pyc
byte-compiling build/bdist.linux-i686/egg/backends/ftp.py to ftp.pyc
byte-compiling build/bdist.linux-i686/egg/backends/mogile.py to mogile.pyc
byte-compiling build/bdist.linux-i686/egg/backends/image.py to image.pyc
byte-compiling build/bdist.linux-i686/egg/S3.py to S3.pyc
creating build/bdist.linux-i686/egg/EGG-INFO
copying django_storages.egg-info/PKG-INFO -> build/bdist.linux-i686/egg/EGG-INFO
copying django_storages.egg-info/SOURCES.txt -> build/bdist.linux-i686/egg/EGG-INFO
copying django_storages.egg-info/dependency_links.txt -> build/bdist.linux-i686/egg/EGG-INFO
copying django_storages.egg-info/not-zip-safe -> build/bdist.linux-i686/egg/EGG-INFO
copying django_storages.egg-info/top_level.txt -> build/bdist.linux-i686/egg/EGG-INFO
creating dist
creating 'dist/django_storages-1.0-py2.5.egg' and adding 'build/bdist.linux-i686/egg' to it
removing 'build/bdist.linux-i686/egg' (and everything under it)
Processing django_storages-1.0-py2.5.egg
creating /home/hdknr/.ve/dev/lib/python2.5/site-packages/django_storages-1.0-py2.5.egg
Extracting django_storages-1.0-py2.5.egg to /home/hdknr/.ve/dev/lib/python2.5/site-packages
Adding django-storages 1.0 to easy-install.pth file

Installed /home/hdknr/.ve/dev/lib/python2.5/site-packages/django_storages-1.0-py2.5.egg
Processing dependencies for django-storages==1.0
Finished processing dependencies for django-storages==1.0
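
Once the egg is installed, a Django project would normally point its file storage at one of the backends. The fragment below is a minimal sketch of a `settings.py` excerpt, not a tested configuration. It assumes the boto-based backend (which also needs the separate boto library), and it assumes the package is importable as the top-level `backends` module, as the build log above suggests. The bucket name and the use of environment variables for credentials are placeholders chosen for this example.

```python
import os

# Assumption: this release installs a top-level "backends" package, so the
# storage class is referenced as backends.s3boto.S3BotoStorage.
DEFAULT_FILE_STORAGE = "backends.s3boto.S3BotoStorage"

# Credentials are read from the environment instead of being hard-coded.
# Reading them from these environment variables is a choice made for this sketch.
AWS_ACCESS_KEY_ID = os.environ.get("AWS_ACCESS_KEY_ID", "")
AWS_SECRET_ACCESS_KEY = os.environ.get("AWS_SECRET_ACCESS_KEY", "")

# Hypothetical bucket name; replace it with a bucket you own.
AWS_STORAGE_BUCKET_NAME = "example-bucket"
```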

Filed under  //   AWS   Python   S3   SimpleDB  


Amazon SimpleDB

Data Storage in Amazon SimpleDB vs. Data Storage in Amazon S3

Unlike Amazon S3, Amazon SimpleDB is not storing raw data. Rather, it takes your data as input and expands it to create indices across multiple dimensions, which enables you to quickly query that data. Additionally, Amazon S3 and Amazon SimpleDB use different types of physical storage. Amazon S3 uses dense storage drives that are optimized for storing larger objects inexpensively. Amazon SimpleDB stores smaller bits of data and uses less dense drives that are optimized for data access speed.

In order to optimize your costs across AWS services, large objects or files should be stored in Amazon S3, while smaller data elements or file pointers (possibly to Amazon S3 objects) are best saved in Amazon SimpleDB. Because of the close integration between services and the free data transfer within the AWS environment, developers can easily take advantage of both the speed and querying capabilities of Amazon SimpleDB as well as the low cost of storing data in Amazon S3, by integrating both services into their applications.

For the Beta release, a single Amazon SimpleDB domain may grow to 10 GB and you are initially allocated a maximum of 100 domains; however, over time these allocations may be raised. Please complete this form if you require additional domains.

Large data goes in S3; small data, such as pointer information, goes in SimpleDB.
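
The split between the two services can be sketched as follows. This is only an illustration: plain dictionaries stand in for an S3 bucket and a SimpleDB domain, no AWS calls are made, and the names `store`, `load`, and the `objects/` key prefix are invented for the example. The 1024-byte threshold reflects the SimpleDB attribute value limit as I understand it and is kept as a parameter.

```python
# Assumption: SimpleDB attribute values are limited to 1024 bytes, so anything
# larger is treated as a "large object" here. The threshold is a parameter.
SIMPLEDB_VALUE_LIMIT = 1024


def store(name, payload, s3_bucket, simpledb_domain, limit=SIMPLEDB_VALUE_LIMIT):
    """Store small values inline in SimpleDB and large ones in S3 with a pointer.

    s3_bucket and simpledb_domain are plain dicts standing in for the real
    services.
    """
    if len(payload) <= limit:
        simpledb_domain[name] = {"value": payload}
        return "simpledb"
    s3_key = "objects/" + name
    s3_bucket[s3_key] = payload
    # Only the pointer and a little metadata go into SimpleDB.
    simpledb_domain[name] = {"s3_key": s3_key, "size": len(payload)}
    return "s3"


def load(name, s3_bucket, simpledb_domain):
    """Fetch a value, following the S3 pointer if one was stored."""
    item = simpledb_domain[name]
    if "s3_key" in item:
        return s3_bucket[item["s3_key"]]
    return item["value"]
```

Queries run against the small SimpleDB items, and the pointer is followed to S3 only when the large object itself is needed.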

Filed under  //   AWS   S3   SimpleDB  


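AWS with Python: Trying out CloudFront + S3 - lolloo-htn's diary

CloudFront is a service that delivers data stored in S3 from the location closest to the client. An easy-to-understand diagram is the one found here. As a practical example, I tried the following.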

Filed under  //   AWS   CloudFront   Python   S3  


Amazon CloudFront

Check out this website I found at docs.amazonwebservices.com

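1. Storage on the origin server

Use ordinary S3 storage and save objects in a bucket. This is billed as S3.

2. Copy to the edge

GET requests and data transfer are billed as S3.
CloudFront copies an object only when it is requested at an edge location (by a client/consumer).

3. Serving objects from the edge location

This part is billed as CloudFront. It is cheaper than delivering directly from S3.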
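
The three steps above amount to a pull-through cache. The sketch below models that behaviour with two small classes; `Origin` and `EdgeLocation` are hypothetical stand-ins written for this example, not part of any AWS library, and cache expiry is deliberately left out.

```python
class Origin:
    """Stand-in for an S3 bucket; counts GET requests (billed as S3)."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.get_count = 0

    def get(self, key):
        self.get_count += 1
        return self.objects[key]


class EdgeLocation:
    """Stand-in for a CloudFront edge that copies objects on first request."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}
        self.served_count = 0

    def serve(self, key):
        if key not in self.cache:
            # Step 2: copy from the origin only when a client asks for it.
            self.cache[key] = self.origin.get(key)
        # Step 3: deliver from the edge (billed as CloudFront).
        self.served_count += 1
        return self.cache[key]
```

Serving the same object twice from one edge triggers a single GET against the origin, which is why the S3 charges stay small once the edge holds a copy.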

Filed under  //   AWS   CF   CloudFront   S#   S3  


Amazon AWS: Uploading an AMI to an S3 bucket

domU-12-31-39-03-74-06:/mnt# vi uploadtos3.sh
 
#!/bin/sh
 
AKID=VKIVJC7L3ddHWR5W745A
SKEY=se8ybExlusxxhuksddxdxi6Hx3ddpIAJf4rd3wj13d
BKT=hdknr
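# Pass the credentials defined above; ec2-upload-bundle requires them.
ec2-upload-bundle --bucket "$BKT" --manifest image.manifest.xml --access-key "$AKID" --secret-key "$SKEY"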

Use /hdknr/image.manifest.xml when registering the AMI.
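
The same upload can be driven from Python without writing keys into a script. The sketch below only builds the argument list and does not run anything. The helper names `build_upload_command` and `manifest_location` are invented for this example, and reading the keys from the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables is an assumption, not something the AMI tools require.

```python
import os


def build_upload_command(bucket, manifest, env=None):
    """Build the argument list for ec2-upload-bundle without running it."""
    env = os.environ if env is None else env
    # Assumption: credentials live in these environment variables.
    access_key = env["AWS_ACCESS_KEY_ID"]
    secret_key = env["AWS_SECRET_ACCESS_KEY"]
    return [
        "ec2-upload-bundle",
        "--bucket", bucket,
        "--manifest", manifest,
        "--access-key", access_key,
        "--secret-key", secret_key,
    ]


def manifest_location(bucket, manifest):
    """Return the bucket-relative manifest path used when registering the AMI."""
    return "%s/%s" % (bucket, os.path.basename(manifest))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where the AMI tools are installed, and `manifest_location("hdknr", "image.manifest.xml")` gives the `hdknr/image.manifest.xml` path used at registration.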

Filed under  //   Amazon   Ami   AWS   Aws Amazon   S3  
