
Using Python to Query GCP Stackdriver logs

<h3>Question</h3>

I am using Python3 to query Stackdriver for GCP logs. Unfortunately, the log entries that have important data are returned to me as "NoneType" instead of as a "dict" or a "str". The resulting "entry.payload" is type "None" and the "entry.payload_pb" has the data I want, but it is garbled.

Is there a way to get Stackdriver to return this data in a clean format, or is there a way I can parse it? If not, is there a way I should query this data that is better than what I am doing and yields clean data?

My code looks something like this:

<pre><code>#!/usr/bin/python3
from google.cloud.logging import Client, ASCENDING, DESCENDING
from google.oauth2.service_account import Credentials

projectName = 'my_project'
myFilter = 'logName="projects/' + projectName + '/logs/compute.googleapis.com%2Factivity_log"'

client = Client(project = projectName)
entries = client.list_entries(order_by=DESCENDING, page_size = 500, filter_ = myFilter)
for entry in entries:
    if isinstance(entry.payload, dict):
        print(entry.payload)
    if isinstance(entry.payload, str):
        print(entry.payload)
    # isinstance() cannot take None as a type, so test for None directly
    if entry.payload is None:
        print(entry.payload_pb)</code></pre>

The "entry.payload_pb" data always starts like this:

type_url: "type.googleapis.com/google.cloud.audit.AuditLog" value: "\032;\n9gcp-user@my-project.iam.gserviceaccount.com"I\n\r129.105.16.28\0228
<h3>Answer1:</h3>

It looks like something is broken in the Python library's protobuf parsing for logging. I found two old issues:

<ol><li>https://github.com/GoogleCloudPlatform/google-cloud-python/issues/3218</li> <li>https://github.com/GoogleCloudPlatform/google-cloud-python/issues/2674</li> </ol>

Both seem to have been resolved some time ago, but I believe the problem was reintroduced. I have a ticket open with Google support on this issue, and they are looking into it.

As a workaround, you have two options:

<ol><li>Create an export (sink) to BigQuery so that you can query your logs easily. The drawback of this approach is that the sink does not export old data collected before it was created.</li> <li>

Use the gcloud command, in particular

gcloud logging read

</li> </ol>

The gcloud command is very powerful (it supports filters and timestamps), but its output format is YAML. You can install the PyYAML library and use it to convert the logs to dictionaries.
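The gcloud workaround can also be driven from a script. The following is a minimal sketch, not the answerer's code: it assumes the gcloud CLI is installed and authenticated, and it swaps in gcloud's general `--format json` output option so that the standard-library json module can parse the result instead of PyYAML. The helper names and the default limit are illustrative.

```python
import json
import subprocess


def build_read_command(project, log_filter, limit=50):
    """Build the argument list for `gcloud logging read`."""
    return [
        "gcloud", "logging", "read", log_filter,
        "--project", project,
        "--limit", str(limit),
        "--format", "json",  # assumption: JSON output instead of the default YAML
    ]


def parse_entries(raw_json):
    """Turn gcloud's JSON output (a list of log entries) into Python dicts."""
    return json.loads(raw_json) if raw_json.strip() else []


def read_logs(project, log_filter, limit=50):
    """Run gcloud and return the parsed entries; requires the gcloud CLI."""
    result = subprocess.run(
        build_read_command(project, log_filter, limit),
        capture_output=True, text=True, check=True,
    )
    return parse_entries(result.stdout)
```

Calling `read_logs('my_project', myFilter)` with the filter string from the question should return a list of dicts, one per log entry.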


<h3>Answer2:</h3>

The LogEntry.proto_payload is an Any message, which encodes some other protocol buffer message. The type of the encoded message is indicated by type_url, and the body of the message is serialized into the value field. After identifying the type, you can deserialize it with something like:

<pre><code>from google.cloud.audit import AuditLog
...
audit_log = AuditLog()
audit_log.ParseFromString(entry.payload_pb.value)</code></pre>

The AuditLog message definition is available at https://github.com/googleapis/googleapis/blob/master/google/cloud/audit/audit_log.proto and the corresponding Python definitions can be built using the protoc compiler.

Note that some fields of the AuditLog message can contain other Any messages too. There are more details at https://cloud.google.com/logging/docs/audit/api/
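The "identify the type first" step can be sketched with the standard library alone. This is only an illustration under stated assumptions: the helper names and the decoder registry are hypothetical, and the placeholder decoder stands in for whatever compiled message class you would actually call `ParseFromString` on.

```python
def message_type_from_url(type_url):
    """Return the fully qualified message name from an Any type_url.

    The message name is the part after the last '/'.
    """
    return type_url.rsplit("/", 1)[-1]


def pick_decoder(type_url, decoders):
    """Look up a decoder callable for the message type, or None if unknown."""
    return decoders.get(message_type_from_url(type_url))


def placeholder_audit_log_decoder(value):
    """Hypothetical stand-in: a real decoder would parse `value` with AuditLog."""
    return {"type": "AuditLog", "size": len(value)}


# Hypothetical registry mapping message names to decoder callables.
DECODERS = {"google.cloud.audit.AuditLog": placeholder_audit_log_decoder}

decoder = pick_decoder("type.googleapis.com/google.cloud.audit.AuditLog", DECODERS)
if decoder is not None:
    print(decoder(b"\x1a\x3b"))
```

Dispatching on the type name keeps unknown payload types from being fed to the wrong message class; entries with no registered decoder can be skipped or logged.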


<h3>Answer3:</h3>

In case anyone has the same issue that I had, here's how I solved it:
1) Download and install protobuf. I did this on a Mac with brew (brew install protobuf).
2) Download and install grpcio. I used pip install grpcio.
3) Download the "Google APIs" to a known directory. I used /tmp and this command: git clone https://github.com/googleapis/googleapis
4) Change directories to the root directory of the repository you just downloaded in Step 3.
5) Use protoc to build the Python module. This command worked for me:
protoc -I=/tmp/googleapis/ --python_out=/tmp/ /tmp/googleapis/google/cloud/audit/audit_log.proto
6) Your audit_log_pb2.py file <em>should</em> now exist under the output directory. protoc mirrors the .proto file's path relative to the -I directory, so look for /tmp/google/cloud/audit/audit_log_pb2.py if it is not directly in /tmp.
7) Place this file on your Python path OR in the same directory as your script.
8) Add this line to the imports in your script:
import audit_log_pb2
9) After I did this, the entry.payload portion of the Protobuf entry was consistently populated with dicts.

PLEASE NOTE: Check which version of protoc you are using with the command protoc --version. You really want protoc 3.x, because the file we are building from uses version 3 of the spec. The Ubuntu package I installed on a Linux box was version 2, which was frustrating. Also, although this file was built for Python 2.x, it seems to work fine with Python 3.x.
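The version check and the build command from the steps above can be scripted. The sketch below is a hedged illustration rather than part of the original answer: the function names are made up, and it assumes `protoc --version` prints a line of the form `libprotoc X.Y.Z`.

```python
import re
import subprocess


def parse_protoc_version(output):
    """Extract the version tuple from text such as 'libprotoc 3.6.1'."""
    match = re.search(r"libprotoc\s+(\d+(?:\.\d+)*)", output)
    if match is None:
        raise ValueError(f"unrecognized protoc version output: {output!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def supports_proto3(output):
    """True when the reported major version is 3 or later."""
    return parse_protoc_version(output)[0] >= 3


def build_protoc_command(include_dir, out_dir):
    """Recreate the protoc invocation from step 5 for the given directories."""
    proto = f"{include_dir.rstrip('/')}/google/cloud/audit/audit_log.proto"
    return ["protoc", f"-I={include_dir}", f"--python_out={out_dir}", proto]


def compile_audit_log(include_dir="/tmp/googleapis/", out_dir="/tmp/"):
    """Check the installed protoc, then compile; requires protoc on PATH."""
    version = subprocess.run(
        ["protoc", "--version"], capture_output=True, text=True, check=True
    ).stdout
    if not supports_proto3(version):
        raise RuntimeError(f"protoc 3.x or later is needed, found: {version.strip()}")
    subprocess.run(build_protoc_command(include_dir, out_dir), check=True)
```

Failing early on an old protoc avoids the confusing syntax errors that a version 2 compiler reports on proto3 files.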


<h3>Answer4:</h3>

I missed this at first, but you can disable gRPC and make the API return a dict (JSON) payload by setting the environment variable GOOGLE_CLOUD_DISABLE_GRPC to a non-empty string, e.g. GOOGLE_CLOUD_DISABLE_GRPC=true.

This populates payload instead of payload_pb, which is easier than compiling a protocol buffer definition that may be out of date!
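If you prefer to set the variable from inside the script rather than in the shell, a minimal sketch looks like this. It assumes the google-cloud-logging package is installed and that the variable must be set before the library is imported, since the library may read it at import time; the function names are illustrative.

```python
import os


def disable_grpc(env=os.environ):
    """Set GOOGLE_CLOUD_DISABLE_GRPC to a non-empty string."""
    env["GOOGLE_CLOUD_DISABLE_GRPC"] = "true"


def print_payloads(project, log_filter):
    """List entries over the JSON/HTTP transport and print their payloads."""
    disable_grpc()
    # Imported only after the variable is set (assumption: read at import time).
    from google.cloud import logging

    client = logging.Client(project=project)
    for entry in client.list_entries(filter_=log_filter):
        print(entry.payload)
```

Calling `print_payloads('my_project', myFilter)` with the filter from the question should then print dict payloads, provided nothing else in the process imported the logging library earlier.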


<h3>Answer5:</h3>

I followed @rhinestone-cowguy's answer, but I think an example of usage will help people who find this answer. To use the compiled (proto) code:

<pre><code>from google.cloud import logging

import audit_log_pb2

client = logging.Client()
PROJECT_IDS = ["one-project", "another-project"]

for entry in client.list_entries(projects=PROJECT_IDS):  # API call(s)
    # The proto payload is an Any message.
    audit_log = audit_log_pb2.AuditLog()
    entry.payload.Unpack(audit_log)
    print(audit_log)</code></pre>

The use of the Any message is documented in the protocol buffers "Python Generated Code" guide.

Source: https://stackoverflow.com/questions/50301632/using-python-to-query-gcp-stackdriver-logs
