Sunday, January 6, 2019

Google Cloud Concepts Part II

This post continues the article below, so if you have not read that one yet, please read it first and then come back here. https://annmaryjoseph.blogspot.com/2018/12/google-cloud-platform-concepts.html

Here I am talking about,

Google Cloud Storage Basics


Google Cloud Storage offers virtually unlimited data storage with global access, and it's available 24/7.

Why use Cloud Storage? You could use it, for example, to serve website content, to provide an archive for disaster recovery, or to offer direct downloads of large data objects.
 
Google offers a number of Cloud Storage options, and which one you use depends on the application and the workload profile.

Cloud Storage options support structured, unstructured, transactional, as well as relational data types.

Cloud Storage solutions are available for a wide array of use cases, including solutions for mobile applications, for hosting commercial software, for providing data pipelines, as well as basic backup storage.

Google Compute Engine is an Infrastructure-as-a-Service offering for provisioning flexible and self-managed VMs hosted on Google's infrastructure.

Google Compute Engine (GCE) is the Infrastructure as a Service (IaaS) component of Google Cloud Platform which is built on the global infrastructure that runs Google's search engine, Gmail, YouTube and other services. Google Compute Engine enables users to launch virtual machines (VMs) on demand.

VMs can be launched from the standard images or custom images created by users. GCE users must authenticate based on OAuth 2.0 before launching the VMs. Google Compute Engine can be accessed via the Developer Console, RESTful API or command-line interface (CLI).


Supported virtual machine operating systems include a number of Windows-based operating systems as well as many of the common Linux distributions. Compute Engine integrates with other Google Cloud Platform services like Cloud Storage, App Engine, and BigQuery. This extends the service to address more complex application requirements.



Google Compute Engine includes predefined machine configurations as well as the ability to define custom machine types that you can optimize for your organization's requirements. With Compute Engine, you can run large compute and batch jobs using preemptible VMs, which are very inexpensive and short-lived compute instances. Fixed pricing with no contracts or commitments simplifies the provisioning and shutdown of virtual machines.

All data written to persistent disk using Compute Engine is encrypted transparently and then transmitted and stored in encrypted form. Google Compute Engine also complies with a number of data and security certifications. Compute Engine offers local solid-state drive (SSD) block storage that is always encrypted. Local SSDs are physically attached to the hosting server, which supports very high IOPS and low latency. With Google Compute Engine instances, maintenance is transparent.


Google data centers use their live migration technology, providing proactive infrastructure maintenance. This improves reliability and security. Live virtual machines are automatically moved to nearby hosts, even while under extreme load, so that underlying host machines can undergo maintenance. This means no need to reboot your virtual machines due to host software updates, or even in the event of some types of hardware failure. Compute Engine instances are highly scalable. And global load balancing technology enables the distribution of incoming requests across pools of Compute Engine instances across multiple regions. The high performance Compute Engine virtual machines boot quickly and have persistent disk storage. In addition, Compute Engine instances use Google's private global fiber network with data centers located worldwide.

CPUs on Demand with Google Compute Engine

Google Compute Engine has predefined machine types, from micro instances to instances with 96 virtual CPUs and 624 GB of memory. The available tiers include Standard, High memory, and High CPU. With Compute Engine, you can also create virtual machines with a configuration that's customized for your particular workloads: from 1 to 64 virtual CPUs, with up to 6.5 GB of memory available for each CPU.


This flexibility means that you can potentially save money, because you don't have to commit to a fixed-size, and often oversized, virtual machine. Instead, you can customize the number of virtual CPUs and the amount of memory. With both custom and predefined machine types available, you can create infrastructure that fits your organization's particular workload requirements.

Compute Engine is very scalable and there is no commitment, meaning that you aren't locked in to whatever configuration you initially choose. For example, Compute Engine offers a start/stop feature that allows you to move your workload to a smaller or larger custom machine type instance, or to a predefined machine type.
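The start/stop resize described above can be sketched from Cloud Shell roughly as follows. This is only a minimal sketch under assumptions: the instance name my-instance, the zone europe-west2-a, and the target type n1-standard-2 are placeholders that do not come from this post, and the function is merely defined here, so nothing runs until you call it yourself.

```shell
# Resize a VM by stopping it, changing its machine type, and starting it again.
# Arguments: instance name, zone, new machine type (all supplied by the caller).
resize_instance() {
  instance="$1"
  zone="$2"
  machine_type="$3"

  # A machine type can only be changed while the instance is stopped.
  gcloud compute instances stop "$instance" --zone "$zone" &&
  gcloud compute instances set-machine-type "$instance" \
    --zone "$zone" --machine-type "$machine_type" &&
  gcloud compute instances start "$instance" --zone "$zone"
}

# Hypothetical usage (uncomment inside Cloud Shell with a real instance):
# resize_instance my-instance europe-west2-a n1-standard-2
```

Chaining the three commands with && means a failed stop or a rejected machine type leaves the instance untouched instead of restarting it in an unexpected state.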

 





Cloud Storage


Google Cloud Storage offers a number of different storage and database options. Which you choose depends on the application, the data type, and the workload profile. Types of data supported include structured, unstructured, transactional, as well as relational data. So let's have a look at the different options.


First, Persistent Disk. This is fully managed block storage suitable for virtual machines and containers, that is, for Compute Engine and Kubernetes Engine instances. It's also recommended for snapshots for data backup. Typical workloads include disks for virtual machines, read-only data sharing across multiple virtual machines, as well as quick and durable backups of virtual machines.


Google Cloud Storage is a scalable, fully managed, reliable, and cost efficient object and blob store. It's recommended for images, pictures, videos, objects, and blobs, as well as unstructured data. Common workloads include streaming and storage of multimedia, as well as custom data analytics pipelines, and archives, backups, and disaster recovery.

Next is Cloud Bigtable. Cloud Bigtable is a scalable, fully managed, NoSQL wide-column database. It is suitable for both real-time access and analytics workloads. Cloud Bigtable is indicated for low-latency read/write access, high-throughput analytics, as well as native time series support. Typical workloads include Internet of Things, streaming data, finance, adtech (that is, marketing), as well as monitoring, geospatial datasets and graphs, and personalization.

Cloud Datastore is a fully managed NoSQL document database for web and mobile applications. It's recommended for semi-structured application data, as well as hierarchical data and durable key-value data. Typical workloads include user profiles, product catalogs for e-commerce, and game state, for example.

Google Cloud SQL is a fully managed MySQL and PostgreSQL database service, built on the strength and reliability of Google's infrastructure. It's recommended for web frameworks, as well as structured data and online transaction processing workloads. Common workloads include web sites and blogs, content management systems and business intelligence applications, ERP, CRM, and e-commerce applications, as well as geospatial applications.


Google Cloud Spanner is a mission-critical relational database service with strong transactional consistency at global scale. It's recommended for mission-critical applications, as well as high-throughput transactions, and for scale and consistency requirements. Typical workloads include adtech and financial services as well as global supply chain and retail.

Google BigQuery is a scalable, fully managed enterprise data warehouse with SQL and very fast response times. Consequently, it's recommended for OLAP workloads, big data exploration and processing, as well as reporting using business intelligence tools. Typical workloads include large data analytical reporting, data science and advanced analyses, as well as processing big data using SQL.

And Google Drive. This is a collaborative space for storing, sharing and editing files, including Google Docs. It's recommended for interaction with docs and files by end users, collaboration, as well as file synchronization between cloud and local devices. Typical workloads include global file access via web, apps, and sync clients, coworker collaboration on documents, as well as backing up photos and other forms of media.


Google Cloud Storage options include a separate line for mobile. For example, we have Cloud Storage for Firebase, which is a mobile and web access service to Cloud Storage with serverless third-party authentication and authorization. Firebase Realtime Database is a realtime NoSQL JSON database for web and mobile applications. Firebase Hosting is production-ready web and mobile content hosting for developers. And Cloud Firestore for Firebase is a NoSQL document database that serves to simplify storing, querying, and syncing data for web apps and mobile apps, especially at global scale.


Google Cloud Shell

Google Cloud Shell is a shell environment for the Google Cloud platform. You can use it to manage projects and resources. And there's no need to install the Google Cloud SDK, it comes pre-installed. It can also be used with the gcloud command-line tool and any other utilities.



Cloud Shell is a temporary Compute Engine VM instance. When you activate Cloud Shell, it provisions a g1-small Compute Engine instance on a per-user, per-session basis. The environment persists while the session is active. It also comes with a code editor, which is a beta version based on Orion, the open source development platform. You can also use Cloud Shell to browse file directories and to view and edit files.

Cloud Shell features include command line access. So you access the virtual machine instance in a terminal window. And it supports opening multiple shell connections to the same instance. That way you can work in one to, say, start a web server, and then work in another shell to perform some other operations. You can also use Cloud Shell to launch tutorials, open the code editor, and download files, or upload files.


It comes with 5 GB of persistent disk storage, mounted as the $HOME directory on that virtual machine instance. This storage is on a per-user basis across multiple projects. Authorization is built in for access to projects and resources hosted on Cloud Platform: because you are already logged in to Cloud Platform, Cloud Shell uses those credentials.

So there's no need for additional authorization. You have access to platform resources, but what access you have depends on the role that's been assigned to the Google Cloud Platform user you've logged in with. Cloud Shell also comes pre-installed with language support for Java, Go, Python, Node.js, PHP, Ruby, as well as .NET. Cloud Shell comes with a number of pre-installed tools. For example, we have Linux shell interpreters such as bash and sh, along with the standard Debian system utilities. For text editors you've got emacs, vim and nano.


For source control, Cloud Shell supports Git and Mercurial. For Google SDKs and tools, you've got the Google App Engine SDK, the Google Cloud SDK, as well as gsutil for cloud storage. It also comes pre-installed with some build and package tools. For example, Gradle, Make, and Maven, as well as npm, nvm, and pip. Additional tools include the gRPC compiler, the MySQL client, Docker, and IPython. Cloud Shell also provides a web preview function that allows you to run web apps on the virtual machine instance and preview them from the Cloud console. The web applications must listen for HTTP requests on ports within a range of 8080 to 8084. And those ports are only available to the secure Cloud Shell proxy service, and that restricts access over HTTPS to your user account only.

To activate Google Cloud Shell, you just click the Activate Google Cloud Shell icon in the toolbar. It opens in a pane at the bottom of the current window.








Now, to list your config defaults, you just type gcloud config list



gcloud compute instances delete deletes one or more Google Compute Engine virtual machine instances.


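Now I'll just type clear and press Enter. To set any defaults, you use gcloud config set, for example gcloud config set project followed by your project ID.


Now let's say that I want to set my default compute zone. The command is gcloud config set compute/zone, and I'll set it to europe-west2-a, a zone in the europe-west2 region (compute/zone expects a full zone name rather than just the region).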







And you can verify it using gcloud config list
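Put together, setting and verifying the defaults might look like the sketch below. The project ID my-sample-project is a made-up placeholder, and europe-west2-a is assumed here as one zone of the europe-west2 region; substitute your own values. The check for gcloud is only there so the snippet exits quietly on a machine without the Cloud SDK.

```shell
# Placeholder values: replace with your own project ID and preferred zone.
PROJECT_ID="my-sample-project"
ZONE="europe-west2-a"

if command -v gcloud >/dev/null 2>&1; then
  # Store the defaults in the active gcloud configuration.
  gcloud config set project "$PROJECT_ID"
  gcloud config set compute/zone "$ZONE"
  # Print the configuration to confirm both values were saved.
  gcloud config list
else
  echo "gcloud not found; run this inside Cloud Shell" >&2
fi
```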


Okay, so now let's clear the screen by typing clear and pressing Enter. You can also see which components come pre-installed, along with the current Cloud SDK version you're using and the latest available version, by typing gcloud components list.





I am on the latest version, but if you would like to update, run gcloud components update, using the sudo command.

If updates are available, press Y to continue. This may take some time.

The Cloud Shell environment, again, supports the standard Debian system utilities, which means that we can, for example, create an alias for a command.
So let's type alias c=clear. Now if I type c, that's like issuing a clear command.

That also means that you can configure variables and so on by placing them directly in the .profile file in the ~ (that is, $HOME) directory, which is where we are right now. If I type pwd, it shows my home directory, and if I type echo $HOME, it shows the same thing.

If I run tail .profile to list the last ten lines of .profile, you'll see that my profile sets up some environment variables.

I've also set up an alias command there. So any time I launch the shell, that alias is set up on this machine, and those environment variables are populated with the values you see there. Okay, so that's Cloud Shell.
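To make the .profile idea concrete without touching a real profile, here is a small sketch that writes the same kind of lines to a throwaway file and loads it. The variable name BLOG_PROJECT and its value are invented for illustration; in Cloud Shell you would add such lines to ~/.profile instead.

```shell
# Use a temporary file as a stand-in for ~/.profile so real settings are untouched.
DEMO_PROFILE="$(mktemp)"

cat > "$DEMO_PROFILE" <<'EOF'
export BLOG_PROJECT="my-sample-project"
alias c=clear
EOF

# Sourcing the file is roughly what the shell does with ~/.profile at startup.
. "$DEMO_PROFILE"

echo "BLOG_PROJECT=$BLOG_PROJECT"
# Show the alias definition that was just loaded.
alias c

rm -f "$DEMO_PROFILE"
```

Because the variable is exported, programs started from this shell see it too, while the alias only exists inside the shell that loaded the file.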
  
Setting Shell and Environmental Variables

Environmental variables are variables that are defined for the current shell and are inherited by any child shells or processes. They are used to pass information into processes that are spawned from the shell.

Shell variables are variables that are contained exclusively within the shell in which they were set or defined. They are often used to keep track of ephemeral data, like the current working directory.

We can see a list of all of our environmental variables by using the env or printenv commands.

The set command can be used to see the shell variables. This is usually a huge list, so you probably want to pipe it into a pager program to deal with the amount of output easily:
set | less






We will begin by defining a shell variable within our current session. This is easy to accomplish; we only need to specify a name and a value. We'll adhere to the convention of keeping all caps for the variable name, and set it to a simple string.
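The difference between a shell variable and an environmental variable can be seen with a short experiment like the one below. The name TEST_VAR and its value are just an illustrative choice; the sketch asks a child shell to print the variable before and after it is exported.

```shell
# Start from a clean slate in case TEST_VAR is already exported.
unset TEST_VAR

# Define a shell variable: it exists only in the current shell.
TEST_VAR='Hello World!'

# A child process does not inherit a plain shell variable.
CHILD_BEFORE=$(sh -c 'printf "%s" "$TEST_VAR"')

# export turns it into an environmental variable that children inherit.
export TEST_VAR
CHILD_AFTER=$(sh -c 'printf "%s" "$TEST_VAR"')

echo "child before export: [$CHILD_BEFORE]"
echo "child after export:  [$CHILD_AFTER]"
```

The first line of output shows empty brackets and the second shows the string, which is the inheritance behavior described in this section.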








Creating a Cloud Storage Bucket in Cloud Shell


Now we'll create a cloud storage bucket using the gsutil tool in Google Cloud Shell.

gsutil is the Cloud Storage command-line tool.

gsutil is a Python application that lets you access Cloud Storage from the command line. You can use gsutil to perform a number of bucket and object management tasks: creating and deleting buckets; uploading, downloading, and deleting objects; listing buckets and objects; moving, copying, and renaming objects; and editing object and bucket access control lists. Before we get started, I just want to show that I've logged in to Google Cloud Platform and have already activated Cloud Shell.

You need Python 2.7 for this, so please verify that you have it. It is already installed in Google Cloud Shell, but if you're using the Cloud SDK on your local machine, you'll need to install Python 2.7. To see whether you have the gsutil tool installed, type gcloud components list.


The command to create a bucket is:

gsutil mb gs://mybucket


And you can see the message it displays: a 409 service exception saying that bucket mybucket already exists.
Obviously, somebody else on Cloud Storage has already used that name. And that's the thing: the name of your bucket has to be unique across all of Cloud Storage. So let's try another name that might work, mybucket1258:


gsutil mb gs://mybucket1258










Next, I create some fake image files and then copy them to the new bucket I created:

touch london.png
touch delhi.png
touch apple.jpeg
ls








The commands to copy the images to the bucket I just created are below; run them one at a time:

gsutil cp london.png gs://mybucket1258
gsutil cp delhi.png gs://mybucket1258
gsutil cp apple.jpeg gs://mybucket1258










Now go to GCP - Storage - Browser, refresh the bucket, and the images are there.


Okay, now, it's always a good idea to clean up after yourself, right? You don't want to incur any charges if you don't have to. To do so, we can use the following commands.

Use gsutil rb to remove a bucket, or gsutil rm -r to remove the bucket and everything underneath it. To use rb, the bucket must be empty; gsutil rm -r, by contrast, removes all the contents and the bucket itself.

gsutil rm gs://mybucket1258/*
gsutil rb gs://mybucket1258/

The first command deletes the contents of the bucket.
The second command deletes the bucket.
Now we can see that we've got no buckets, so we successfully cleaned up after ourselves.
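Since bucket names have to be unique across Cloud Storage, one way to avoid the 409 error is to generate the name instead of guessing. The sketch below is one possible approach, not something from the original walkthrough: the helper names make_bucket_name and bucket_round_trip are invented here, and the round trip is left commented out so nothing is created unless you run it yourself in Cloud Shell.

```shell
# Build a bucket name that is unlikely to collide:
# a prefix, a timestamp, and the ID of the current shell process.
make_bucket_name() {
  prefix="$1"
  printf '%s-%s-%s\n' "$prefix" "$(date +%Y%m%d%H%M%S)" "$$"
}

# Create a bucket, upload one empty placeholder file, list it,
# then remove the bucket together with its contents.
# Needs gsutil and a logged-in session (both available in Cloud Shell).
bucket_round_trip() {
  bucket="gs://$1"
  tmpfile="$(mktemp)"
  gsutil mb "$bucket" &&
  gsutil cp "$tmpfile" "$bucket/placeholder.png" &&
  gsutil ls "$bucket" &&
  gsutil rm -r "$bucket"
  rm -f "$tmpfile"
}

BUCKET_NAME="$(make_bucket_name mybucket)"
echo "Candidate bucket name: $BUCKET_NAME"

# Uncomment inside Cloud Shell to run the full create/copy/list/delete cycle:
# bucket_round_trip "$BUCKET_NAME"
```

The generated name uses only lowercase letters, digits, and dashes, and stays well under the 63-character limit for bucket names.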


Data Analysis with GCP

Modern applications typically generate large amounts of data from many different sources. Devices or sensors can capture raw data in unprecedented volumes, and this data is perfect for analysis. This analysis can provide insight into an organization's operating environment and business.


BigQuery is a serverless, highly scalable enterprise data warehouse on Google Cloud Platform. (In the context of a data warehouse, serverless means being able to simply store and query your data, without having to purchase, rent, provision, or manage any infrastructure or software licensing.) It's fully managed, so there is no infrastructure to manage, which eliminates the need for a database administrator, and an organization can focus on analyzing the data using SQL. It enables real-time capture and analysis of data by creating a logical data warehouse. And with BigQuery, you can set up your data warehouse in minutes and start querying against huge amounts of data.

Cloud Dataflow is a unified programming model and managed service from Google Cloud Platform using the Apache Beam SDK. The Apache Beam SDK supports powerful operations that resemble MapReduce operations, powerful data windowing, and fine-grained correctness control against streaming or batch data. It's used for a wide range of data processing patterns, including streaming and batch computations and analytics, as well as extract, transform, load (ETL) operations. Since it's fully managed, it frees you from operational tasks like capacity planning, resource management, and performance optimization.

Cloud Dataproc is a fully managed and fast cloud service that is used to run Apache Spark and Hadoop clusters. You can take advantage of the Apache big data ecosystem using tools, libraries, and documentation for Spark, Hive, Hadoop, and even Pig. Features include the fact that it's fully managed, which means automated cluster management: you've got managed deployment, logging, and monitoring. This means an organization can focus on the data, not the cluster, and clusters are scalable, speedy, and robust.

With Cloud Dataproc, clusters can also utilize Google Cloud Platform's flexible virtual machine infrastructure, including custom machine types as well as preemptible virtual machines, which lets you size and scale clusters to fit the workload. Since it's on Google Cloud Platform, Cloud Dataproc integrates with several other Cloud Platform services like Cloud Storage, BigQuery, Bigtable, and Stackdriver Logging and Monitoring. This results in a comprehensive, powerful data platform. Versioning lets you switch versions of big data tools like Spark, Hadoop, and others. And you can operate clusters with multiple master nodes and set jobs to restart on failure, ensuring high availability.



Google's Cloud SQL is a fully managed database service that simplifies the creation, maintenance, management, and administration of relational databases on Cloud Platform. Cloud SQL works with MySQL or PostgreSQL databases. Cloud SQL offers fully managed MySQL Community Edition database instances in the cloud, and these are available in the US, EU, or Asia. Cloud SQL supports both first and second generation MySQL instances. And data encryption is provided on Google's internal networks, as well as for databases, temporary files, and backups. You can use the Cloud Platform Console to create and manage instances. The service also supports instance cloning.


Cloud SQL for MySQL also supports secure external connections using Cloud SQL Proxy or SSL. It also provides data replication between multiple zones with automatic failover. You can import and export databases with mysqldump, or import and export CSV files. Cloud SQL for MySQL offers on-demand backups and automated point-in-time recovery, as well as Stackdriver integration for logging and monitoring.



Cloud SQL for PostgreSQL is a Beta service from Google supporting PostgreSQL 9.6 database instances in the cloud, and these, again, are available in the US, EU, or Asia. It supports custom machine types with up to 416 GB of RAM and 32 CPUs, as well as up to 10 terabytes of storage, which can also be increased if required. Similar to Cloud SQL for MySQL, you can use Cloud Platform Console to create and manage instances. And data encryption is provided on Google's internal networks, and for databases, temporary files, and backups.

Cloud SQL for PostgreSQL supports secure external connections using the Cloud SQL Proxy or SSL. You can use SQL dump to import and export databases. And Cloud SQL offers PostgreSQL client-server protocol and standard PostgreSQL connectors support. Cloud SQL for PostgreSQL offers on-demand and automated backups, as well as Stackdriver integration for logging and monitoring.


Creating a MySQL Database with Cloud SQL

I'm logged in to my Google Cloud Platform account. I'm going to click on the hamburger menu, scroll down, and click SQL under Storage. So here I am on the Cloud SQL page, and I'm going to click Create Instance. [The page asks you to choose a database engine: MySQL version 5.6 or 5.7, or PostgreSQL BETA version 9.6.] I'm going to leave MySQL selected and click Next. [The next page offers two types of Cloud SQL MySQL instances, with a Learn more link: MySQL Second Generation (Recommended) and MySQL First Generation (Legacy).] I'm going to click Choose Second Generation. [The Create a MySQL Second Generation instance page has an Instance ID field, a Root password field with a No password checkbox, Region and Zone dropdowns under Location, a Create button, and an option to Show configuration options.] Now we'll enter the instance information. We'll call this instance annmj.

You can leave the rest of the configuration at the defaults. But you can also click to show more configuration options, and drill right down into labels, maintenance schedule, database flags, authorized networks, and so on. We'll hide those configuration options, since we don't need them, and click Create. It returns us to the Instances page, where you'll see that it's initializing the instance. Okay, so our instance has now initialized and started, and we can connect to it using the MySQL client in Cloud Shell.

We have two choices. We can start, or activate, Cloud Shell manually [by clicking the icon in the upper right corner of the toolbar] and optionally set some default configurations with gcloud config set:

gcloud config set project gcde-2019-197823
gcloud config set compute/zone us-central1-a

Or we can connect to it from the Instance details page.

To do that, click on the instance, scroll down on the Instance details page, and click Connect using Cloud Shell. [This is under the Connect to this instance tile, which displays the instance IPv4 address and the instance connection name. Cloud Shell opens at the bottom of the screen.]


Okay, so the connection string was populated in Cloud Shell automatically, so we'll just press Enter to connect as root. [The message displays: Whitelisting your IP for incoming connection for 5 minutes.] You have to wait a couple of minutes until it's done whitelisting the IP for this connection. Once the notification says done, it has modified the instance to whitelist the IP for this connection, and we can enter the password. So click into the shell, enter the root password that you set, and press Enter. [After the password is entered, a message displays: Welcome to the MariaDB monitor. Commands end with ; or \g.]

So now we're at the MySQL prompt. First, we'll create a database called blog:
CREATE DATABASE blog;






And we can see that it was successful. Now we'll insert some sample blog data into our new database. First, we have to tell MySQL to use the blog database. Then we create the table blogs with a few columns: bloggerName, content, and entryID, which is the primary key.

USE blog;
CREATE TABLE blogs (bloggerName VARCHAR(255), content VARCHAR(255), entryID INT NOT NULL AUTO_INCREMENT, PRIMARY KEY (entryID));








INSERT INTO blogs (bloggerName, content) values ("Leicester", "this is MY first blog");
INSERT INTO blogs (bloggerName, content) values ("London", "this is MY first blog");
INSERT INTO blogs (bloggerName, content) values ("Scotland", "this is MY first blog");

Okay, so we should have some data. Let's type select * from blogs; (don't forget the semicolon) and press Enter, and there is our data.

Now to exit, just type exit and press Enter. It's always a good idea to clean up after yourself, so I'm going to close Cloud Shell and go back to our instance. We'll scroll up to the top of the Instance details page, click the ellipsis, and select Delete. Then we have to type the instance name to confirm.















Creating a PostgreSQL Database with Cloud SQL



Okay, our PostgreSQL instance has initialized and started. At this point, we can connect to it using the psql client in Cloud Shell. We have a couple of options. We can either manually activate Google Cloud Shell, and optionally specify some configuration settings, or we can connect directly by clicking Connect using Cloud Shell from the Instance details page. This is the easiest way, so let's go ahead and do this.

When the prepopulated connection command appears (it ends with your instance name, here mypsqlinstance, followed by --user=postgres --quiet), press Enter and then enter the password.














CREATE DATABASE blog;
\connect blog;

Enter the password when prompted.

CREATE TABLE blogs (bloggerName VARCHAR (255), content VARCHAR (255), entryID SERIAL PRIMARY KEY) ;

Use \d+ table_name to show information about the columns of a table:

\d+ blogs






INSERT INTO blogs (bloggername, content) values ('Joe', 'this is my first blog');
INSERT INTO blogs (bloggername, content) values ('Wade', 'this is my first blog');
INSERT INTO blogs (bloggername, content) values ('Bill', 'this is my first blog');
select * from blogs;

Now to quit from the psql prompt, just type \q and press Enter.









I was new to PostgreSQL, so I visited the site below for the basics.