Add function to convert node tree back to HTML #33

Merged: 2 commits, Oct 7, 2015
Changes from 1 commit
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -12,6 +12,7 @@ This project adheres to [Semantic Versioning](http://semver.org/).
- Add the child combinator to `Floki.find/2`.
- Add the adjacent sibling combinator to `Floki.find/2`.
- Add the general adjacent sibling combinator to `Floki.find/2`.
- Add `Floki.raw_html/2`

## [0.4.1] - 2015-09-18

12 changes: 12 additions & 0 deletions README.md
@@ -43,6 +43,10 @@ Floki.find(html, "#content")
Floki.find(html, "p.headline")
# => [{"p", [{"class", "headline"}], ["Floki"]}]

Floki.find(html, "p.headline")
|> Floki.raw_html
# => <p class="headline">Floki</p>


Floki.find(html, "a")
# => [{"a", [{"href", "http://github.com/philss/floki"}], ["Github page"]},
@@ -129,6 +133,14 @@ Floki.find(html, ".example")
# => [{"div", [{"class", "example"}], []}]
```

To convert your node tree back to raw HTML (spaces are ignored):

```elixir
Floki.find(html, ".example")
|> Floki.raw_html
# => <div class="example"></div>
```

To fetch some attribute from elements, try:

```elixir
44 changes: 44 additions & 0 deletions lib/floki.ex
@@ -77,6 +77,50 @@ defmodule Floki do
Parser.parse(html)
end

@self_closing_tags ["area", "base", "br", "col", "command", "embed", "hr", "img", "input", "keygen", "link", "meta", "param", "source", "track", "wbr"]

@doc """
Converts node tree to raw HTML (spaces are ignored).

## Examples

iex> Floki.parse(~s(<div class="wrapper">my content</div>)) |> Floki.raw_html
~s(<div class="wrapper">my content</div>)

"""

def raw_html(tuple) when is_tuple(tuple), do: raw_html([tuple])
def raw_html([], html), do: html
def raw_html([value|tail], html) when is_bitstring(value), do: raw_html(tail, html <> value)
def raw_html([first_dom|tail], html \\ "") do
elem = first_dom |> elem(0)
Owner

You can use pattern matching to get the values, like

{elem, attrs, value} = first_dom
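
A minimal sketch of how that suggestion might look when the destructuring is moved into the function head. The module name `RawHtmlSketch` and the simplified `tag_for/3` (it ignores self-closing tags) are hypothetical stand-ins for illustration, not the code in this PR:

```elixir
defmodule RawHtmlSketch do
  # Hypothetical stand-in module; the point is the destructuring in the
  # third clause, not a full reimplementation of the function under review.
  def raw_html(nodes, html \\ "")
  def raw_html([], html), do: html
  def raw_html([text | tail], html) when is_binary(text), do: raw_html(tail, html <> text)

  # Match the node tuple directly instead of calling elem/2 three times.
  def raw_html([{tag, attrs, children} | tail], html) do
    raw_html(tail, html <> tag_for(tag, tag_attrs(attrs), children))
  end

  # Joins {name, value} pairs into a name="value" attribute string.
  defp tag_attrs(attrs) do
    Enum.map_join(attrs, " ", fn {name, value} -> ~s(#{name}="#{value}") end)
  end

  # Simplified on purpose: self-closing tags are not handled in this sketch.
  defp tag_for(tag, "", children), do: "<#{tag}>#{raw_html(children)}</#{tag}>"
  defp tag_for(tag, attrs, children), do: "<#{tag} #{attrs}>#{raw_html(children)}</#{tag}>"
end

IO.puts(RawHtmlSketch.raw_html([{"p", [{"class", "headline"}], ["Floki"]}]))
# => <p class="headline">Floki</p>
```

Matching on `{tag, attrs, children}` also makes the clause fail fast on anything that is not a three-element tuple, which the `elem/2` version would only partly catch.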

attrs = first_dom |> elem(1) |> tag_attrs
value = first_dom |> elem(2)
raw_html(tail, html <> tag_for(elem, attrs, value))
end

def tag_attrs(attr_list) do
Owner

I think this function could be private.
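
As a rough illustration of what making the helper private would mean, here is a small sketch. `PrivateHelperSketch` and `open_tag/2` are hypothetical names invented for this example; only `defp` and `function_exported?/3` are standard Elixir:

```elixir
defmodule PrivateHelperSketch do
  # Hypothetical module, used only to show the effect of `defp`.

  # Public entry point: renders an opening tag from a tag name and attribute list.
  def open_tag(tag, []), do: "<#{tag}>"
  def open_tag(tag, attrs), do: "<#{tag} #{tag_attrs(attrs)}>"

  # `defp` keeps the helper out of the module's public API.
  defp tag_attrs(attrs) do
    Enum.map_join(attrs, " ", fn {name, value} -> ~s(#{name}="#{value}") end)
  end
end

IO.puts(PrivateHelperSketch.open_tag("a", [{"href", "http://elixir-lang.org"}]))
# => <a href="http://elixir-lang.org">

IO.inspect(function_exported?(PrivateHelperSketch, :tag_attrs, 1))
# => false
```

Callers outside the module can still reach the behaviour through the public function, so the helper can be renamed or restructured later without breaking anyone.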

attr_list
|> Enum.reduce("", fn(c,t) -> ~s(#{t} #{elem(c,0)}="#{elem(c,1)}") end)
|> String.strip
end

def tag_for(elem, attrs, value) do
Owner

This function could be private as well.

if Enum.member?(@self_closing_tags, elem) do
if attrs != "" do
"<#{elem} #{attrs}/>"
else
"<#{elem}/>"
end
else
if attrs != "" do
"<#{elem} #{attrs}>#{raw_html(value)}</#{elem}>"
else
"<#{elem}>#{raw_html(value)}</#{elem}>"
end
end
end

@doc """
Find elements inside a HTML tree or string.

48 changes: 48 additions & 0 deletions test/floki_test.exs
@@ -108,6 +108,19 @@ defmodule FlokiTest do
}
end

@basic_html """
<div id="content">
<p>
<a href="uol.com.br" class="bar">
<span>UOL</span>
<img src="foo.png"/>
</a>
</p>
<strong>ok</strong>
<br/>
</div>
"""

# Floki.parse/1

test "parse html_without_html_tag" do
@@ -119,6 +132,41 @@ defmodule FlokiTest do
]
end

# Floki.raw_html/2

test "raw_html" do
raw_html = Floki.parse(@basic_html) |> Floki.raw_html
assert raw_html == String.split(@basic_html, "\n") |> Enum.map(&(String.strip(&1))) |> Enum.join("")
end

test "raw_html (html with data attributes)" do
raw_html = Floki.parse(@html_with_data_attributes) |> Floki.raw_html
assert raw_html == String.split(raw_html, "\n") |> Enum.map(&(String.strip(&1))) |> Enum.join("")
end

test "raw_html (after find)" do
raw_html = Floki.parse(@basic_html) |> Floki.find("a") |> Floki.raw_html
assert raw_html == ~s(<a href="uol.com.br" class="bar"><span>UOL</span><img src="foo.png"/></a>)
end

# Floki.tag_attrs/1
Owner

Since I suggested making those functions private, you can remove the tests below. They are implementation details of `raw_html/2`, so they don't need to be tested separately.


test "tag attrs" do
fake = ~s(href="http://elixir-lang.org" target="_blank" class="btn")
assert fake == Floki.tag_attrs([{"href", "http://elixir-lang.org"}, {"target", "_blank"}, {"class", "btn"}])
end

test "empty tag attrs" do
assert "" == Floki.tag_attrs([])
end

# Floki.tag_for/3

test "tag_for" do
tag = Floki.tag_for "a", ~s(href="http://elixir-lang.org" target="_blank"), [{"img", [{"src", "foo.png"}], []}]
assert tag == ~s(<a href="http://elixir-lang.org" target="_blank"><img src="foo.png"/></a>)
end

# Floki.find/2 - Classes

test "find elements with a given class" do