python - Propagate pandas Series metadata through joins


I want to be able to attach metadata to the Series of DataFrames (specifically, the original filename), so that after joining two DataFrames I can see metadata on where each Series came from.

I have found discussions about _metadata, including some about the current _metadata attribute, but nothing in the pandas docs.

So far, I can modify the _metadata attribute to supposedly allow preservation of metadata, but I get an AttributeError after the join.

    import numpy as np
    import pandas as pd

    df1 = pd.DataFrame(np.random.randint(0, 4, (6, 3)))
    df2 = pd.DataFrame(np.random.randint(0, 4, (6, 3)))

    # Register 'filename' as a metadata attribute on the frame and on its Series
    df1._metadata.append('filename')
    df1[df1.columns[0]]._metadata.append('filename')

    # Tag every column with the file it came from
    for c in df1:
        df1[c].filename = 'fname1.csv'
        df2[c].filename = 'fname2.csv'

    df1[0]._metadata     # ['name', 'filename']
    df1[0].filename      # fname1.csv
    df2[0].filename      # fname2.csv
    df1[0][:3].filename  # fname1.csv

    mgd = pd.merge(df1, df2, on=[0])
    mgd['1_x']._metadata  # ['name', 'filename']
    mgd['1_x'].filename   # raises AttributeError

Is there any way to preserve it?

Update: Epilogue

As discussed, __finalize__ cannot keep track of Series that are members of a DataFrame, only independent Series. So for now I will keep track of Series-level metadata by maintaining a dictionary of metadata attached to each DataFrame. My code looks like this:

    def cust_merge(d1, d2):
        "Custom merge function for 2 dicts"
        ...

    def finalize_df(self, other, method=None, **kwargs):
        for name in self._metadata:
            if method == 'merge':
                # For a merge, 'other' carries the left and right inputs
                lmeta = getattr(other.left, name, {})
                rmeta = getattr(other.right, name, {})
                newmeta = cust_merge(lmeta, rmeta)
                object.__setattr__(self, name, newmeta)
            else:
                object.__setattr__(self, name, getattr(other, name, None))
        return self

    df1.filenames = {c: 'fname1.csv' for c in df1}
    df2.filenames = {c: 'fname2.csv' for c in df2}
    pd.DataFrame._metadata = ['filenames']
    pd.DataFrame.__finalize__ = finalize_df

I think something like this will work (and if not, please file a bug report: while this is supported, it is a bit bleeding edge, so it is possible that the join methods do not call it all the time; that part is a bit untested).

See the related issue for a more detailed example and bug fix.
    from pandas import DataFrame

    DataFrame._metadata = ['name', 'filename']

    def __finalize__(self, other, method=None, **kwargs):
        """
        Propagate metadata from other to self.

        Parameters
        ----------
        other : the object from which to get the attributes that we are
            going to propagate
        method : optional, a passed method name; possibly to take different
            types of propagation actions based on this
        """

        ### you need to arbitrate here when there are conflicts

        for name in self._metadata:
            object.__setattr__(self, name, getattr(other, name, None))
        return self

    DataFrame.__finalize__ = __finalize__

This replaces the default __finalize__ for DataFrame with your custom one. Where I have indicated, you need to add code that arbitrates between conflicts. That is the reason this is not done by default: for example, if frame1 has the name 'foo' and frame2 has the name 'bar', what do you do when the method is __add__? What about another method? Let us know what you do and how it works out.
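As one illustration of the arbitration step, here is a minimal sketch of a policy function that a custom __finalize__ could call for each metadata name. The function name arbitrate, the policy itself, and the method strings 'merge' and 'concat' are assumptions made for illustration; check which method values your pandas version actually passes.

```python
def arbitrate(left_value, right_value, method=None):
    """Choose a metadata value when two inputs may disagree.

    Hypothetical policy, not part of pandas.
    """
    if left_value == right_value:
        return left_value  # no conflict
    if left_value is None:
        return right_value  # only one side carries metadata
    if right_value is None:
        return left_value
    if method in ("merge", "concat"):
        # Both inputs contribute data, so keep both provenances.
        return (left_value, right_value)
    # For anything else (e.g. arithmetic), refuse to guess.
    return None


print(arbitrate("foo", "bar", method="merge"))
print(arbitrate("foo", "bar", method="__add__"))
```

Returning None for unknown methods is a deliberately conservative choice: a missing value is easier to notice than a silently wrong one.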

This only replaces the method for DataFrame (and you can simply do the default action if you want, which is to propagate other to self); you can also set nothing except under special cases of method.

This method is meant to be overridden in subclasses, which is why you are monkey patching here (rather than subclassing, which is overkill most of the time).
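For the per-column dictionary approach from the epilogue, the merge case also has to follow the column renaming that pd.merge applies: by default, overlapping non-key columns get the suffixes _x and _y. The sketch below is one way to combine two {column: filename} dictionaries under that assumption. The name merge_filenames and the choice to store a tuple for key columns are hypothetical; the original cust_merge body was not shown, and this is not a reconstruction of it.

```python
def merge_filenames(left, right, on, suffixes=("_x", "_y")):
    """Combine two {column: filename} dicts, mirroring pd.merge renaming.

    Hypothetical helper: assumes the merge uses `on` keys present in both
    inputs and the given suffixes for overlapping non-key columns.
    """
    overlap = (set(left) & set(right)) - set(on)
    merged = {}
    for col, fname in left.items():
        if col in on:
            # Key columns appear once in the result; record both sources.
            merged[col] = (fname, right.get(col))
        elif col in overlap:
            merged[f"{col}{suffixes[0]}"] = fname
        else:
            merged[col] = fname
    for col, fname in right.items():
        if col in on:
            continue  # already handled with the left-hand keys
        if col in overlap:
            merged[f"{col}{suffixes[1]}"] = fname
        else:
            merged[col] = fname
    return merged


left = {0: "fname1.csv", 1: "fname1.csv", 2: "fname1.csv"}
right = {0: "fname2.csv", 1: "fname2.csv", 2: "fname2.csv"}
print(merge_filenames(left, right, on=[0]))
```

With the frames from the question, this gives '1_x' the filename fname1.csv and '1_y' the filename fname2.csv, which is the lookup that failed on the merged Series.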
